Learning Space dedicated to
the Art and Analyses of Film Sound Design


Audiovisual poetry or Commercial Salad of Images?
- Perspective on Music Video Analysis

By Sven E Carlsson

Music video is a many-faceted multi-discursive phenomenon. Some generally acknowledged "facts" about music video are that ...

i) music videos communicate through TV-screen and TV-speakers
ii) music videos are a form of low-brow popular culture
iii) the reception of music videos depends on the beholder (music video may be beautiful or ugly, art or trash, etc.).

Michael Shore (1984: 98–99) concludes that music video is

recycled styles … surface without substance … simulated experience … information overload … image and style scavengers … ambivalence … decadence … immediate gratification … vanity and the moment … image assaults and outré folks … the death of content … anesthetization of violence through chic … adolescent male fantasies … speed, power, girls and wealth … album art come to turgid life … classical storytelling’s motifs … soft-core pornography … clichéd imagery …


Many disparate approaches are possible when music videos are being dissected. One of the most common methods of analysis is to break up the music video into black and white boxes. Almost everything is then perceived as opposites – trash or art, commerce or creativity, male or female, naturalism or antirealism, etc.

When this method is used on music videos in general, videos fall into two rough groups: performance clips and conceptual clips. When a music video mostly shows an artist (or artists) singing or dancing, it is a performance clip. When the clip shows something else during its duration, often with artistic ambitions, it is a conceptual clip.

Music video artist as a "modern mythic embodiment"
I have developed the following mythical method of analysis, which I call "modern mythic embodiment". Viewed from this perspective, the music video artist is seen as embodying one, or a combination, of three general "modern mythic characters or forces". The music video artist represents different aspects of the free-floating, disparate universe of music video.

In one type of performance, the performer is not a performer anymore; he or she is a materialization of the commercial exhibitionist. He or she is a monger of his or her own body image, selling everything to be in the spotlight – selling voice, face, lifestyle, records, and so on. This commercial exhibitionist wants success and tries to evoke the charisma of stardom and sexuality; he or she wishes to embody dreams of celebrity, to be an icon, the center of procreative wishes.

Another type of performance in the music video universe is that of the televised bard. He or she is a modern bard singing banal lyrics using television as a medium. The televised bard is a singing storyteller who uses actual on-screen images instead of inner, personal images. Sometimes the televised bard acts in the story – sometimes he or she is far away and inserted images help him or her tell the story. The greatest televised bards create audio-visual poetry. They transform the banal story of the lyrics, employing on-screen images to create a story about life and death. Too often, however, the televised bard only contemplates her or his own greatness and unfulfilled wishes.

The third type of performer is the electronic shaman. Sometimes the shaman is invisible and it is only her or his voice and rhythm that anchor the visuals. He or she often shifts between multiple shapes. At one moment the electronic shaman animates dead objects or has a two-dimensional alter ego (as in cartoon comics); seconds later he or she is shifting through time, and so on. The electronic shaman is our guide on a spiritual journey through blipping images and magical attributes. And the electronic shaman promises that there is a hidden meaning in everything; he or she promises that we live in a magical, mythical reality. The electronic shaman’s voice and rhythm form the life-line that connects images and sound, simultaneously creating new experiences and associations for those involved in the conscious-streaming journey outside time and space. The electronic shaman's performance, and the other two types of performance, can be seen in Cher's music video Believe (1998).

Analyzing Cher’s Video Believe

By using the method of "modern mythical embodiment" it is possible to view Cher as an electronic shaman in Believe: she is a modern sorceress using the electronic magic of visual special effects. The spiritual journey begins with concentration: her eyes glow in the dark, the fog rises, people walk in a slowed-down, dreamy way, etc. The spiritual journey concludes with a magical green surrounding Cher as a soul exchange happens between the young woman and her.

Cher also uses the electronic magic of aural effects (vocoder) on her voice to tell the audience about her own, and maybe also her younger soulsister’s unearthly pain. Cher’s voice and the "synthesized" modern dance rhythms anchor the dreamlike journey. It is also possible to see Cher as a televised bard, singing a story about life after love. An unhappy girl in the discotheque watches her ex-boyfriend against a backdrop of happy dancing people. Cher is a singing story-teller who visits this narrative world. She actively participates in the story only at its conclusion, when she changes places with the young girl.

In Believe Cher also promotes her record and her audiovisual style. She scavenges on feelings of teenage unhappiness, using these feelings as a commercial commodity. Cher aims to evoke the charisma of stardom and sexuality. Older now, she takes advertising help from the images of young healthy bodies. Thus she can also be seen as a commercial exhibitionist.

Believe (1998)

Standard clip with song performance and visual narration

Audiovisual Analysis of Music Video
Clearly, production of meaning in music videos is complex, comprising several flows of audio-visual information. These flows interact and the resultant meaning is perceived as one complete whole, created by both the ears and eyes. To illustrate these information flows in a particular moment I use a model of audio-visual analysis that I have developed. The model is a "crude map" which points out how three aspects of the audio-visual flow – music, image, and text – interact to produce meaning for "literate" audio-viewers (cf. Chion 1994) of music video. Sound effects are not as common in music video as in ordinary TV-programming, yet when they do appear, they are often used at the video's beginning or end.

Music video is a form of audio-visual communication in which the meaning is created via carriers of information such as (1) the music, (2) the lyrics and (3) the moving images.

The Music
The music video is composed by adding images to music.
The music video director creates moving pictures for an already existing tune. A music video lacking a coherent narrative based on visuals and lyrics uses unifying aspects which are a distinctive trait of music. Images are bound together by the beat and other musical features.

Sometimes the musical elements shape the moving pictures. Movements like footsteps are often synchronized with the beat, so that people in the music video seem to walk in sync with the music.
Melodic phrases can also be visualized by tilting the camera vertically to match the musical phrase's travel up and down the scale. In Cher’s music video Believe there is a synthesized cymbal-rattle sound which is sometimes synchronized with lighting effects, and which sometimes amplifies body movements.

The Lyrics
Lyrics and images interact creating meaning.
In many music videos a new meaning is added to the banal lyrics through metaphorical language, often with an amusing twist. When presented well, the concurrence of lyrics and images opens a dimension that can create a poetic experience. The greater the leap between the content of the lyrics and the imagery in this metaphorical joining, the more difficult it becomes for viewers to understand and interpret the context. The opposite of the metaphorical joining of lyrics and images occurs when the lyrics are simply illustrated by the visual imagery. For instance, if a dog is mentioned in the lyrics, we see a dog on the TV-screen; if a child is mentioned, we are shown a child. The result is like a salad of images in which the visual story is missing: the story is carried by the music and lyrics and not by an independent visual story.

In the first verse of Cher’s Believe, the phrases "I can’t break through" and "so sad" are accompanied by a kind of visual echo made by special effects. I interpret this feature as a text-image metaphor. A text-illustration appears, for example, when the girl sits drinking as the lyrics intone "Sit around and wait for you". Some clichés in the video are quite amusing. For example, Cher wears military pants in the third verse, when she forcefully repeats "I don’t need you anymore". A standard gimmick in film-making is to make the environment mirror the feelings of the leading characters. This kind of effect occurs when the unhappy girl climbs to the rooftop and a rain of tears pours from the sky.

The Image
The visual form is close to the musical form.
To begin the analysis of the image, the basic ideas behind the footage must be discerned in order to identify the key concept behind the video. By manipulating color, motifs, setting, story footage, clothing and so forth, the music video director creates a couple of ideas which are repeated and varied. The concept is to rearrange visual motifs so that the work forms a whole. The concept behind Believe is that Cher is a mystic singer who comments on a girl meeting a boy. The boy rejects the girl and leaves with another girl.

In the video there are two main visual motifs: one is Cher’s performance track, the other is the narrative track about the unhappy girl. Further visual motifs are the lighting effects from the discotheque and especially the light that travels between the girl and Cher.

Yet the concept does not have to consist of visual motifs: it could be a short silent movie accompanied by background music. A good example of this concept is Bruce Springsteen's I'm on fire (1986), in which a "grease monkey" falls in love with a female customer. The mechanic drives the customer's car to her house, but he lacks the nerve to ring her doorbell and he walks away alone.

I'm on fire (1986)
I’m on fire is a pure narrative clip

Perceiving Music Videos
There seem to be different layers of perception when a human being is audio-viewing a music video. Interacting layers of perception may be instinctive, inter-subjective and individual, which in turn activate social aspects such as family, peer group, region, country, language etc. When these different layers interact with unique personal memories and instinctive behavior, the analysis of music video becomes complicated. Situation variables occur frequently and they are often the content. My experience as an educator in music video suggests that many young people are unable to view a music video if they dislike the artist or the music.

To give a simplified explanation, music video pictures can be interpreted as a merging of three traditions of moving images: singing performance, visual story-telling, and the non-narration of modern art.

The cinematic tradition of singing performance is as old as the first motion picture with sound – when Al Jolson sang My Mammy in The Jazz Singer (1927). Since then this type of performance has continued in promotion pictures, musicals and concert documentaries. The basic formula behind the filmed performance is to take a popular singing performer and place him or her in a setting either literally suggested by the song’s lyrics or in one that mirrors the escapist pleasantries common to movie musicals.

Visual story-telling has developed from the early days of film-making into the film language of today. The basic rules of visual narration make it easy to follow a story as in a TV-soap: an outside shot on a window, inside shot presenting the room, shot of a man and woman, close shot on a face showing grief, etc. If it is possible to follow a filmed story without written text or aural cues (in speech, music or sound effects) then the film maker has used the grammar of visual narration.

In opposition to traditional visual narration, the non-narration of modern art was developed by representatives of 20th-century art forms, who created experimental movies like Fernand Leger and Dudley Murphy’s Ballet Mecanique (1924) and Oskar Fischinger’s Composition in Blue (1934). These experimental achievements are nowadays a part of standard music video narration technique. In a music video clip, collages, paraphrases, animated abstract art, computer graphics, and unexpected combinations of pictures may appear. The shock aesthetics in music videos can, from this point of view, be interpreted as a combination of the provocative modern art tradition and a cultural interpretation of the teenage rebellion.

Ballet Mecanique (1924)
Visual abstract form

Composition in Blue (1934)

Animated abstract art

Visual Music Video Styles

Standard Clip
A standard clip, as I call it, is a music video that contains more or less these three visual traditions: a filmed singer is blended with inserted images and the presentation is artistically influenced by the experimental film tradition. Queen’s influential Bohemian Rhapsody (1975) provided a model of a standard music video and has all these narrative traits.

The same goes for Robert Miles' One and One (1996). The video exposes the vocalist, and the shots and the editing contain a passable portion which emphasizes strange artistic features. There is a kind of narrative inserted in the video with highlights of failure and dreams. Even Cher’s Believe is a standard clip. In relation to the visual traditions, the video is placed in the traditions of performance and visual narration.

The concept of the standard clip is dynamic and has many variations. The vocalist might actively participate in the story while simultaneously standing outside the video, offering self-reflexive commentary; he might have a singing alter ego, for instance a cartoon character; or he might change clothes between cuts, jump around in time, shift his shape, fly, float, etc.

Bohemian Rhapsody (1975)

One and One (1996)
Standard clip with instrumental and song performances and visual narration

There are three pure forms of visual tradition in music video: performance clip, narrative clip, and art clip.

Performance Clip
If a music video clip contains mostly filmed performance then it is a performance clip. A performance clip is a video that shows the vocalist(s) in one or more settings. Common places to perform are the recording studio and the rehearsal room. But the performance can take place anywhere, from the bathtub to outer space. Walking down the street is another performance cliché, which is common in rap videos.

The performance can be of three types: song performance, dance performance and instrumental performance. Almost every music video includes song performance. Some videos combine song and dance performances. Michael Jackson’s videos often contain dance performance. Instrumental performance is not so common, but it occurs occasionally. Concert performance on stage with audience is so common that it has its own category, the concert clip.

Narrative Clip
If a music video clip is most appropriately understood as a short silent movie to a musical background, it is a narrative clip. A narrative clip contains a visual story that is easy to follow. A pure narrative clip contains no lip-synchronized singing. Bruce Springsteen’s I’m on fire is a pure narrative clip.

Art Clip
If a music video clip contains no perceptible visual narrative and contains no lip-synchronized singing then it is a pure art clip. The main difference between a music video art clip and a contemporary artistic video is the music. While the music video uses popular music, the artistic video uses more modern, experimental music, such as electro-acoustic music.

The connections between the music genre and the visual genre of music video are weak: when you listen to a new record you may know the genre of the music but seldom do you immediately know how the moving image will be realized. There are, however, some connections. Dance music video clips are sometimes art clips. The editing technique in soft ballads is mostly mixing. The hard rock music genre usually features concert clips with inserted narrative shots.

Final Countdown (1994)

Europe's Final Countdown is mostly a Concert clip

Abstract form
The repetition of images in a music video mostly resembles the form of music, and the formal principle may thus be called abstract form. Music videos are often organized around what we might call "theme and variations". This term applies to music in which a melody, or, motif, follow

Music video works in a similar fashion, using pictorial elements that function thematically. Pictorial elements are the small parts into which a moving image may be divided: quick zooming in and out, short cuts, colors, shapes, movements, settings, clothes, footage etc. The abstract qualities of these pictorial elements, assembled in thematic combinations, create form. An introduction often presents the basic pictorial elements, which will develop into visual motifs.

For example, in the first 20 seconds of Lisa Stansfield’s So Natural (1993), pictorial elements such as colors (orange, blue and natural), flowing water, and camera positions develop into visual themes.

Shot 1

Shot 2


Shot 6

So Natural (1993)
So Natural is a Standard clip with almost no visual narration

>> Read more about Abstract form in "Principles of Film Form" in Film Art: An Introduction with Tutorial CD-ROM

What is music video?
Andrew Goodwin (1993: 3) lists these theoretical conclusions as the basis of understanding music videos:

cinematic genre -- advertising -- new forms of television -- visual art -- "electronic wallpaper" -- dreams -- post-modern texts -- nihilistic neo-Fascist propaganda -- metaphysical poetry -- shopping mall culture -- LSD -- "semiotic pornography".


Scholars may sometimes know too much and refer to their own perspective and specialty. Music video has its own conventions which do not necessarily follow those of cinema. For example, backlighting effects are often interpreted as a reference to film noir, although a more reasonable association is with the lighting conventions of live performance.

In music video, narrative relations are highly complex, and meaning can be created by anything from the individual audio-viewer’s personal musical taste to sophisticated intertextuality that draws on multidiscursive phenomena of Western culture. For example, George Michael’s video Killer/Papa Was A Rolling Stone (1993) illustrates the words of the lyrics with logotypes of well known consumer products. This feature plays on another form of intertextuality – illustrating important words in the song lyrics is quite common in music video, especially in rap videos. Thus the new words in the form of commodities’ logotypes become an innovative way to reuse this tradition. Some audio-viewers may identify with the commercialization of human feelings signaled by the video.

Killer/Papa Was A Rolling Stone (1993)

Killer/Papa Was A Rolling Stone is an Art clip with inserted song performance (in extreme close up)

The music video Killer/Papa Was A Rolling Stone also stretches the audiovisual texture – the video's music track is a live concert with audience sounds; but the video contains no shots from the concert, only a few big close-ups on George Michael’s mouth synchronized with the music track.

I believe that music video is today its own form of art with its own traditions. I hope this form of art will get its own theory – not just old theories "amputated" to fit music video. I have tried to point to possible directions for studying music video.

Music video is not inherently uninteresting, trivial and dumb. Music video is interesting and fun; it always offers something to look at and something upon which we need to reflect.


Bordwell, David & Thompson, Kristin (2006) Film Art: An Introduction with Tutorial CD-ROM

Chion, Michel (1994) Audio-Vision: Sound on Screen

Goodwin, Andrew (1992) Dancing in the Distraction Factory: Music Television and Popular Culture

Shore, Michael (1985) The Rolling Stone Book of Rock Video

Art films:

Ballet Mecanique (1924), Fernand Leger and Dudley Murphy

Composition in Blue (1934), Oskar Fischinger

Music videos:

CHER - Believe (1998), Dir: Nigel Dick

EUROPE - Final Countdown (1994), Dir: Arxel

GEORGE MICHAEL - Killer/Papa Was A Rolling Stone (1993), Dir: Marcus Nispel

ROBERT MILES - One and One (1996), Dir: Michael Geoghegan

BRUCE SPRINGSTEEN - I'm on fire (1986), Dir: John Sayles

LISA STANSFIELD - So Natural (1993), Dir: Marcus Nispel

QUEEN - Bohemian Rhapsody (1975), Dir: Bruce Gowers

This article was originally published in Musiikin Suunta nr 2 1999, special issue in English on music videos, The Finnish Society for Ethnomusicology, University of Helsinki, Finland.



