The corpus

  1. Systematic description
  2. Inclusion guidelines
  3. Quantitative description: Statistics and selected sub-corpora
  4. Sources
  5. The HyperHamlet text

1. Systematic description

HyperHamlet is a database of quotations from and allusions to Hamlet. The collection of data began in 2002/03 with a research seminar on "Hamlet's Presence". Since then, the pioneer corpus of references, mostly determined by students' interests, has been revised and enlarged through systematic research by the Basel team of linguists and literary scholars and through suggestions from the public. The Project history gives details.

Data collection is based on three main sources: electronic searches, extant secondary literature and serendipitous reading. Individual finds as well as systematic searches give the corpus its richness. Online databases such as Literature Online or the British National Corpus as well as annotated editions of fictional or non-fictional works by various authors have been searched systematically.

HyperHamlet is a specialized corpus. The choice of material is influenced by the prevalence of references in the domain of "highbrow" literature (including literary journalism) and popular culture. Accessibility is another conditioning factor: inevitably, digitally searchable texts constitute the most important source. HyperHamlet is therefore not what is generally called a representative corpus. Nevertheless, HyperHamlet aims at versatility and size in order to serve as a resource for researchers beyond literature as well. References from many non-fictional genres, and also from music and the visual arts, are admitted to the corpus.

All entries are searchable for bibliographical data, language, reference type (such as references to specific lines, characters or motifs) and genre. Further codings include the modification patterns in quotations, marking for quotation, marking for derivation, textual function and the intertextual relation between the quoted and the quoting text as a whole. For details consult Search options.

2. Inclusion guidelines

The formal categories of the HyperHamlet database are geared towards intertextual references in the form of quotations, i.e. short Hamlet extracts occurring in longer texts that have no general connection with Shakespeare's play.

It is not always easy to draw the line between verbatim quotations such as "a little more than kin and less than kind" and shortened or collocational usage in variants such as "less than kind". Such doubtful or "weak" quotations are included if the Hamlet phrase fulfills at least one of the following criteria:

These "weak" entries can be included in any search of the database by setting the STATUS menu to Complete collection; the default option Core corpus filters them out. The same goes for the small number of "non-quotation" entries that derive from texts older than Hamlet. These antedating texts represent a sample of possible sources for the play.
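The effect of the STATUS menu described above can be pictured with a minimal sketch. It is an illustration only, not the database's actual implementation: the `Entry` class, the status labels and the function `filter_by_status` are hypothetical names, and the third sample entry is a placeholder rather than a real record.

```python
from dataclasses import dataclass

# Hypothetical status labels; the database's internal values are not documented here.
CORE = "core"
WEAK = "weak"
NON_QUOTATION = "non-quotation"


@dataclass
class Entry:
    text: str
    status: str


def filter_by_status(entries, menu_setting="Core corpus"):
    """Mimic the STATUS menu: the default hides weak and non-quotation entries."""
    if menu_setting == "Complete collection":
        return list(entries)
    if menu_setting == "Core corpus":
        return [e for e in entries if e.status == CORE]
    raise ValueError(f"unknown STATUS setting: {menu_setting!r}")


# Sample data: the first two phrases come from the text above,
# the third is a placeholder for an antedating source.
entries = [
    Entry("a little more than kin and less than kind", CORE),
    Entry("less than kind", WEAK),
    Entry("(phrase from a text older than Hamlet)", NON_QUOTATION),
]

core_only = filter_by_status(entries)
everything = filter_by_status(entries, "Complete collection")
```

With the default setting only the verbatim quotation is returned; with Complete collection all three sample entries are.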

The database does, however, also accommodate other kinds of relationship between Hamlet and later cultural productions; not all of Hamlet's afterlife can be adequately described as "quotation".

Further kinds of data could be accommodated by the database but are not included: firstly, documents relating to historical performances of the play (as opposed to the fictional stagings of Hamlet which are currently documented among the literary references) and, secondly, critical comments of any kind, which Gérard Genette calls "metatextual" rather than "intertextual" or "hypertextual" references.

3. Quantitative description


In 2010, at the conclusion of the period of funding by the SNF, the corpus included 9000 entries by about 3500 authors. Of these authors, 2300 are represented by a single entry; 80 have between 10 and 25 entries.

Most of the references are in English, followed by 5% in German and 3% in French, Russian and about 20 other languages. In this regard, too, the corpus is not truly representative, but it does document the cross-cultural relevance of Hamlet.

The ratio between line references and references to motifs, scenes and characters is 4:1. This is attributable to the search strategies (thematic allusions are harder to detect in digital sources than lexical surface matches) but also to the afterlife of many phrases which have come to be used independently of Hamlet in a wide variety of non-literary contexts.

References from works of fiction cover about 55% of the corpus (prose 30%, poetry and songs 11%, drama including film and opera 13%). References taken from non-fictional sources amount to 43%, while the rest derive from visual, musical and mixed-media sources and from small verbal forms such as anagrams, brand names or anecdotes.
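As a quick plausibility check, the rounded shares quoted above can be added up. The sketch below uses only the percentages given in the text; the variable names are illustrative, and because the figures are rounded, the remainder is approximate.

```python
# Rounded shares of the corpus, as quoted in the text (in percent).
fiction = {
    "prose": 30,
    "poetry and songs": 11,
    "drama (incl. film and opera)": 13,
}
non_fiction = 43

fiction_total = sum(fiction.values())      # the "about 55%" of the text
rest = 100 - fiction_total - non_fiction   # visual, musical, mixed media, small verbal forms

print(f"fiction: {fiction_total}%, non-fiction: {non_fiction}%, rest: {rest}%")
# prints "fiction: 54%, non-fiction: 43%, rest: 3%"
```

The fiction sub-genres sum to 54%, consistent with "about 55%", which leaves roughly 2 to 3% for the remaining source types.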

Selected sub-corpora

References from the works of inveterate quoters have been collected systematically, using secondary literature and annotated editions. The famous heavyweights are William Hazlitt (329 references), James Joyce (155), Charles Dickens (193) and Sir Walter Scott (254), followed by Samuel Taylor Coleridge (82), Lord Byron (95), Dorothy L. Sayers (71) and the German 19th-century novelist Wilhelm Raabe (72).

The persistent Hamlet quotations in minor Romantic and early Victorian writers are a new discovery owed to electronic searches. The results include 75 Hamlet references in the works of the radical novelist William Godwin as well as between 30 and 60 quotations each by Charlotte Turner Smith, Francis Lathom, Rosina Doyle Bulwer-Lytton, Susan Ferrier, Leigh Hunt, Hester Lynch Piozzi, Edgar Allan Poe and Thomas Love Peacock.

Hamlet is also popular with genre fiction writers, as shown by HyperHamlet's collections of references in crime fiction (400 entries), romance (50), Gothic (90) and science fiction (60).

A small number of quotation dictionaries have been harvested for Hamlet references. Entries from older dictionaries and anthologies such as John Cotgrave's English Treasury of Wit and Language (1655) or Edward Bysshe's Art of English Poetry (1762) are of particular historical interest for charting the development of Shakespeare's popularity as a phrasemaker.

All the 17th-century references noted in the 1932 Shakspere Allusion Book are recorded. This collection lists references to Shakespeare and his works up to 1700 and forms the core of our early corpus. All in all, HyperHamlet currently contains 318 17th-century references to Hamlet or recyclings of phrases from the play, of which 171 are not – to our knowledge – mentioned in previously published research.

A limited number of entries are taken from texts older than Hamlet. Such possible sources for the play represent an interesting sample of the many phrases and sayings which Shakespeare himself recycled and which he made famous through the particularly memorable form that he gave them. Moreover, dozens of expressions in Hamlet which have become known as "Shakespeare quotations" are in fact based on the Bible or on proverbs or clichés which were generally current in Shakespeare's time. These entries can be found by typing "bible", "proverb" or "cliché" into the search field TEXT.

4. Sources

The published research that has been mined for Hamlet references and allusions is listed in the Research bibliography. Roughly half the entries are not, to our knowledge, mentioned in previously published research but derive from systematic electronic searches in full-text databases and free Web content as well as from the private reading of project staff. Individual finds are regularly submitted by other Hamlet-spotting contributors; all these suggestions are checked, formatted and annotated before becoming publicly available entries in the database.

5. The HyperHamlet text

Hamlet exists in no fewer than three 'Ur'-texts that can claim independent authority to different degrees and that are used for critical editions. HyperHamlet, however, uses a basic text that reflects not so much the latest scholarly insights as a long-term consensus of use. A database which documents centuries of literary and popular quotation should ideally record all variants and emendations that have affected the meaning and the status of the play, without (and this is crucial) privileging one version over others, as this would bring back the "dream of the master text".1 So the text should in no way suggest any claim to privileged status or have revisionist aspirations (as do the 1986 Oxford Shakespeare and, to a lesser extent, The Norton Shakespeare). The decision to use a text widely available on the internet was therefore not a decision for "the best text available", but for the least bad one for the purposes of this project. The chosen text, the Moby Shakespeare, has the added advantage of not being covered by copyright.

1 The Norton Shakespeare, based on the Oxford Edition. Ed. Stephen Greenblatt et al. New York and London: W.W. Norton, 1997. 65.