In thematic groups of 3-4, outline an example corpus tagset that fits the common theme of your group (i.e. envisage a corpus that would be useful to all members, or at least have every member contribute an idea) and then use it to create a corpus document. You will find an example corpus document on the FTP.
Follow these steps:
- establish the common theme of your hypothetical corpus (e.g. learner corpus, corpus of metaphors, parallel translation corpus, conversation analysis/pragmatics corpus, contrastive analysis corpus, error analysis corpus, etc.);
- consider what the tags of your mark-up will/would/should be;
- propose some attributes for the tags;
- propose some values for the attributes;
- consider what kind of metadata to include in your document;
- find a short text and create an example corpus document in XML format (any dialect) using your annotation scheme;
- note any and all problems that you encounter (in order to discuss them later in class);
- submit your document.
- the final deadline for this assignment is 11/12/2017;
- only .xml documents will be accepted (any number of editors can be used for this purpose, including, but not limited to, the free Notepad++ or the paid oXygen XML);
- your document must contain an explicit header and explicit body;
- this is a teamwork exercise; hence, any group smaller than 3 or larger than 4 will receive a slight penalty (-10%) for every participant by which it deviates from the norm; the only exception to this rule is a group left short of members because the class does not divide evenly;
- there will obviously be a difference in proportions between corpus-based/corpus-driven and corpus-informed studies, so be sure to indicate what kind of study your hypothetical corpus is intended for;
- when testing your annotation scheme, I will attempt to perform query searches using your tagset (as appropriate, considering your group’s theme).
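
To make the requirements above concrete, here is a minimal sketch of what a document with an explicit header and body might look like, and of the kind of tag-based query it should support. Everything in it is an assumption made for illustration: the element names (`corpusDoc`, `header`, `body`, `s`, `err`), the attributes (`type`, `corr`, `l1`, `level`) and the sample sentences are hypothetical and belong to an imagined learner/error-analysis corpus, and the Python script is only one possible way of querying a tagset, not necessarily the procedure that will be used when your scheme is tested.

```python
import xml.etree.ElementTree as ET

# Hypothetical corpus document: all tag and attribute names are invented
# for this example and are not a prescribed scheme.
SAMPLE = """
<corpusDoc>
  <header>
    <title>Learner essay 001</title>
    <theme>learner corpus</theme>
    <studyType>corpus-based</studyType>
    <speaker id="S1" l1="Polish" level="B2"/>
  </header>
  <body>
    <s n="1">Yesterday I <err type="tense" corr="went">go</err> to the library.</s>
    <s n="2">I found <err type="article" corr="an">a</err> interesting book about <err type="tense" corr="writing">write</err> essays.</s>
  </body>
</corpusDoc>
"""


def has_header_and_body(root):
    """Return True if the root element has both a <header> and a <body> child."""
    return root.find("header") is not None and root.find("body") is not None


def find_errors(root, err_type):
    """Return (learner form, correction) pairs for one error type in the body."""
    matches = root.findall(f"./body//err[@type='{err_type}']")
    return [(e.text, e.get("corr")) for e in matches]


# fromstring() raises ParseError if the document is not well-formed XML.
root = ET.fromstring(SAMPLE)

print(has_header_and_body(root))
print(root.findtext("./header/studyType"))
print(find_errors(root, "tense"))
```

A query such as `find_errors(root, "tense")` only works because the error type is stored as an attribute value rather than as free text, which is worth keeping in mind when you decide what becomes a tag, what becomes an attribute, and what becomes a value.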