Data Sci. Discussions: Cross-lingual frameworks and natural annotation of crowdsourced online dictionaries
- Note: There is an alternative view of this document available on Google Colab. This is functionally very similar to shared Google docs; offer comments where necessary.
See the original iteration/outline of this language study here
Quick overview
In the realm of cross-lingual or multilingual natural language processing (NLP), some languages tend to get more attention than others. Well-established languages, with a plethora of linguistic resources to build from, often thrive while low-resource creole languages are left to languish. Creoles are a particular challenge for the NLP community because these languages often arise from frantic and urgent needs to establish harmony in communication within cacophonous settings. In the Caribbean, creole languages often emerged as a means of survival and endured as a result of resilience; speakers often struggled to work within the bounds of a dominant prestige language while retaining unique traces of their heritage languages or contact languages. The resilience of these languages is currently being tested, with the SARS-CoV-2 pandemic threatening the lives of older language-keepers. Therefore, the careful creation of tools and frameworks is needed to meet society’s creole language needs.
Yet it is this very discordant origin and complexity of structure that presents issues for the preservation of creole languages. For example, most words in Saint Lucian Kwéyòl/creole focus on emotions, the weather, and other aspects of the natural environment, including animals and food sources (Frank, 2007); therefore, finding domain-equivalent literature sources outside of certain contexts can be challenging. Moreover, it is said that just over 83% of vocabulary words have French origins, roughly 3% are English-based, and Amerindian, African, and East Indian sources each account for about 0.5% of the total (Frank, 2007). Even the author of the official creole dictionary acknowledged that gaps in its vocabulary were due to the lack of official etymological details for over 11% of documented words (Crosbie et al., 2001; Frank, 2007). Despite the advantage of cross-referencing parallel language data sources, the challenges are made more complex by the many vocabulary words that lack details on their origins. Ultimately, the situation could be described as a “Mondrian-like” language setting. This image seems apt for a low-resource creole that sits close to high-resource languages (like French and English) with parallel and monolingual data, yet whose available language data may belong to different domains (missing reference). Not much NLP research has been done on adapting its components to creole languages; however, the academic tide has been changing as of 2020 (Soto, 2020).
There are multiple ways to utilize NLP functions. The increasing popularity of opinion-rich resources such as blogs (Joseph, 2020), shopping websites, review portals, and social media platforms (Joseph, 2020) is rapidly attracting business people, governments, and researchers alike. Sentiment analysis often garners high interest from businesses. However, there are challenges in natural language processing, particularly when dealing with sentiment analysis, word-sense disambiguation, and dependency parsers in cross-lingual settings. A major challenge of lexical semantics is creating training data and algorithms that facilitate downstream tasks. Moreover, it has been said that while there is a lingering ‘bias towards contemporary Indo-European languages’, treebanks for other language families and for classical languages are on the rise (Nivre et al., 2016). It has been suggested that categorizing the challenges and formalizing their interpretation using Universal Dependencies may help create a Saint Lucian Kwéyòl dependency treebank, and later facilitate other needed NLP tasks (such as sentiment analysis of texts). A Saint Lucian Kwéyòl parser may be constructed by leveraging the base knowledge of French syntax. However, these may not be the only tools required to tackle the challenges of digitizing and analyzing creole languages. Therefore, it may be helpful to develop a framework to improve the models for linguistics, perhaps an extendable framework whose application to low-resource creole languages can be demonstrated.
My current overarching scholarly inspirations deal with Universal Dependencies parsing, POS tagging, and possibly sentiment analysis. Some works have suggested that the lexical deviations inherent to creoles can be accommodated by applying the dependency relation definitions of an existing language (like French or English) while ensuring consistency with the annotations in other non-French or non-English Universal Dependencies (UD) treebanks (Wang et al., 2020). The document ‘From Genesis to Creole Language: Transfer Learning for Singlish Universal Dependencies Parsing and POS Tagging’ (Wang et al., 2020) is my main point of reference for now, due to the relevance of the topic and the creativity of the work. Additional consideration may be given to a paper that focuses on ‘NLPyPort’; the authors discuss an ‘NLP pipeline in Python, primarily based on NLTK, and focused on Portuguese’ (Ferreira et al., 2019). This was said to be constructed, in part, from ‘pre-existent resources or their adaptations, but improves over the performance of existing alternatives in Python, namely in the tasks of tokenization, PoS tagging, lemmatization, and NER’ (Ferreira et al., 2019).
I also acknowledge that I need to delve into the discussion of treebanks and their creation. Since I am dealing with a creole that is mostly French-based, it might be wise to review the works of Anne Abeillé. Her most relevant works might be ‘Treebanks: Building and using parsed corpora’ (Abeillé, 2012), ‘Building a treebank for French’ (Abeillé et al., 2003), and ‘Enriching a French treebank’ (Abeillé & Barrier, 2004).
Finally, I want to highlight my three sources of inspiration concerning meaningful social and linguistic projects: the works of Abraham Maslow, Luisa Maffi, and Dr. Bruno Gonçalves. Consideration of Maslow’s theory on a hierarchy of needs allows individuals (persons or businesses) to approach projects with deliberate recognition of motivations; his works allow for careful prioritization and planning of resource allocation (Maslow, 1943; Matias et al., 2020; Wong et al., 2020; Casale & Flett, 2020). Luisa Maffi is associated with the coining of the term ‘biocultural diversity’ (BCD). Projects on this topic tend to have some connection to conservation efforts and have strong ties to linguistics (particularly endangered languages); according to Maffi, biological, cultural, and scientific complexity add to the various forms of diversity in life (Terralingua, 2019; Maffi & others, 2007; Maffi, 2005; Terralingua, 2020; Terralingua, 2014). Gonçalves is a linguist specializing in multiple languages and their presence in social media like Twitter, yet he manages to visually tie his findings to his appreciation of the physical environment from which the messages originate. His works seem to have cross-domain appeal; in fact, one could assert that they have some significance to the topic of ‘biocultural diversity’.
Gonçalves’ works on the ranking of touristic sites worldwide based on their attractiveness (measured with geolocated data as a proxy for human mobility) and ‘mapping world languages through microblogging platforms’ (Bassolas et al., 2016), on ‘learning Spanish dialects through Twitter’ by employing a corpus based on geographically tagged messages (Mocanu et al., 2013; Gonçalves & Sánchez, 2014), and on exploring a global language network as a form of ranking languages (Ronen et al., 2014) were all visually fascinating.
However, replicating his works with the inclusion of a creole language may be a bit difficult. Still, one may garner inspiration from his 2017 paper on ‘Semantic homophily in online communication’, which draws on evidence from Twitter (Šćepanović et al., 2017); his discussion of ‘value homophily’, as it relates to one’s ‘internal states’ that might influence future behavior, can be adapted to discuss the similarities and differences between the islands that share mutually intelligible creoles (like the Antillean creole spoken among the inhabitants of Dominica and Saint Lucia). Based on the works of Wang, Abeillé, Gonçalves, and even Maffi, one may ponder the possibility of leveraging existing treebanks and focusing on POS tagging and sentiment analysis for languages that do not have an extensively established digital presence or corpora.
Bearing this in mind, it may also be useful to leverage the natural annotation provided by crowdsourced online dictionaries related to low-resource languages, such as Wiwords. For example, this could continuously address word sense disambiguation issues that arise from the text analysis of Saint Lucian Kwéyòl social media chatter on various platforms, like Facebook, Instagram, and Twitter. By improving word sense disambiguation for the creole language, one may be better able to evaluate the language’s vitality by assessing the frequency of its usage in social media posts.
For example, avocado translates to zabòka in the official Saint Lucian Kwéyòl dictionary. However, posts on Instagram, Facebook, and Twitter (and even a book on Amazon) refer to the same item using the spellings ‘zaboca’, ‘zabocca’, and ‘zaboka’. The crowdsourced online dictionary Wiwords lists the same item with the spelling ‘zaboca’ and includes pictures for clarification.
A cursory search of Twitter revealed that the ‘zaboca’ spelling was just about as common as ‘zaboka’, yet the accented spelling, ‘zabòka’, was not present. One could say that Wiwords did indeed reflect the term’s typical informal (social media) form. While Wiwords does not list the official spelling or all other alternative spellings, it does help with understanding the presence or absence of diacritical marks and provides context needed to improve word sense disambiguation.
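The spelling variation described above can be handled, at least in a first pass, by folding informal spellings onto the dictionary headword. The following is a minimal sketch, not an established pipeline: the `VARIANTS` table and the function names are hypothetical, and the only spellings included are the ones mentioned in this section.

```python
import unicodedata
from typing import Optional


def strip_diacritics(text: str) -> str:
    """Lowercase the text and remove combining accent marks."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Hypothetical lookup table: accent-free informal spelling -> official headword.
# Only the spellings discussed in the text are listed.
VARIANTS = {
    "zaboca": "zabòka",
    "zabocca": "zabòka",
    "zaboka": "zabòka",
}


def to_headword(token: str) -> Optional[str]:
    """Map a social media token to its dictionary headword, if known."""
    return VARIANTS.get(strip_diacritics(token))


print(to_headword("Zaboca"))
print(to_headword("zabòka"))
print(to_headword("mango"))
```

Counting headwords rather than raw spellings would let all observed variants contribute to a single frequency estimate, which is the quantity of interest when assessing usage in social media posts.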
In terms of creole data, I currently have access to traditional folk songs (Joseph, 2020; Yannucci, 2020), discourses (Weekes, 2014; Frank, 1990), and stories (SIL, 1989). Digital versions of a Bible (New Testament) (“Tèstèman nèf-la: Épi an posyòn an liv samz-la,” 1999; “Saint Lucian Creole French New Testament,” 2004; “Glory (Saint Lucian Creole French),” 2020), a few documents from the government such as the Saint Lucian national anthem (“Kwéyòl National Anthem approved,” 2016), and Kwéyòl public service announcements (Louisy, 2004; SLU, 2020) are also accessible. Public social media data are also useful, particularly the postings by verified Saint Lucian Kwéyòl writers (Joseph, 2020; Joseph, 2020). In 1993 a document entitled “Dances and Songs from a Caribbean Island” was created, and in 1996 one named “Select Bibliography of the Literature of the English-speaking West Indies” followed; if the documents above are insufficient, I will consult these two anthological works for additional data (“Musical Traditions of St. Lucia, West Indies,” 1993; Carnegie, 1996). There are a few books, but they are mostly physical documents (I need to finish looking into their digital versions).
I also have access to a partially labeled XML version of the Saint Lucian Kwéyòl dictionary dataset (but no sentiment data) (missing reference). There are numerical indicators to signify the differences between homonyms, and according to the dictionary’s creator:
‘the set of parts of speech, or word classes, used in this dictionary is as follows: N (noun), PRO (pronoun), ADJ (adjective), ART (article), V (verb), ADV (adverb), PREP (preposition), CONJ (conjunction), and INTERJ (interjection). These are only broad categories. In a more complete grammatical description of Kwéyòl, these broad categories could and should be further broken down into subcategories’ (missing reference).
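Since the goal is a Universal Dependencies treebank, the dictionary's broad word classes would at some point need to be related to UD's universal POS tags. The sketch below shows one possible mapping. The correspondences are my own assumption, not something stated by the dictionary or by UD for Kwéyòl, and `DICT_TO_UPOS` and `candidate_upos` are hypothetical names. CONJ is deliberately left ambiguous because UD separates coordinating (CCONJ) from subordinating (SCONJ) conjunctions.

```python
# Assumed mapping from the dictionary's word classes to UD universal POS tags.
# Each value is a tuple of candidate tags; more than one candidate means a
# human annotator (or context) has to decide.
DICT_TO_UPOS = {
    "N": ("NOUN",),
    "PRO": ("PRON",),
    "ADJ": ("ADJ",),
    "ART": ("DET",),
    "V": ("VERB",),
    "ADV": ("ADV",),
    "PREP": ("ADP",),
    "CONJ": ("CCONJ", "SCONJ"),
    "INTERJ": ("INTJ",),
}


def candidate_upos(dictionary_tag: str) -> tuple:
    """Return candidate UD tags for a dictionary word class.

    Unknown classes fall back to ("X",), the UD tag for 'other'.
    """
    return DICT_TO_UPOS.get(dictionary_tag.strip().upper(), ("X",))


for tag in ("N", "prep", "CONJ", "unknown"):
    print(tag, "->", candidate_upos(tag))
```

A mapping like this only gives a starting point; as the dictionary's creator notes, the broad categories would need to be broken down further for a fuller grammatical description.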
In terms of technology and tools, I have found Python to be very useful. GitHub has also proven useful in allowing one to explore various topics through shared community efforts. There are Universal Dependencies treebanks for various languages like Tamil, Spanish, and French; however, there are no creole ones as of yet. I am currently reviewing a few of Abeillé’s works, including “The Universal Dependency version of the French Treebank” (Abeillé et al., 2003), ‘UD_French-FTB’, and the presentation on ‘Deep Universal Dependencies’ created by Daniel Zeman. UD_French-FTB is stated to be a ‘treebank of sentences from the newspaper Le Monde, initially manually annotated with morphological information and phrase-structure and then converted to the Universal Dependencies annotation scheme’.
Additionally, Google Colab has proven to be an interesting environment that facilitates document and code sharing, commenting, and version control. One also has access to a GPU, which tends to be useful for deep learning and some parallelizable data science algorithms. There is also access to a research ‘Seedbank’, where Google hosts an entire repository of deep learning and data science proofs of concept. This includes code concerning ‘Neural Translation with Attention’, ‘Tensor2Tensor: Translate from English to German with a pre-trained model’, and even a ‘Multilingual Language Model Dataset’.
While Python is very useful, I am also currently reviewing the treebank construction feature and pre-trained annotation models offered by ‘UDPipe Natural Language Processing - Text Annotation’. According to its website, the most relevant features would be:
- Allowing R users simple access in order to easily tokenize, tag, lemmatize or perform dependency parsing on text in any language
- ‘Provide easy access to pre-trained annotation models’
- Allowing R users to easily construct their own annotation model based on data in CONLL-U format, as provided in more than 100 treebanks available at https://universaldependencies.org (Wijffels, 2020).
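To make the CoNLL-U format mentioned above more concrete, here is a minimal sketch of a reader written in plain Python. It does not use UDPipe (which the text describes as an R package) or any other library; the function name `parse_conllu` is hypothetical, and the sample sentence is a toy English example with annotations I supplied for illustration only, not Kwéyòl data.

```python
# The ten tab-separated columns of a CoNLL-U token line.
FIELDS = ("id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc")


def parse_conllu(text: str) -> list:
    """Parse CoNLL-U text into a list of sentences.

    Each sentence is a list of dicts keyed by FIELDS. Comment lines
    (starting with '#') are skipped; blank lines end a sentence.
    """
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue
        columns = line.split("\t")
        if len(columns) != len(FIELDS):
            raise ValueError(f"expected 10 columns, got {len(columns)}: {line!r}")
        current.append(dict(zip(FIELDS, columns)))
    if current:
        sentences.append(current)
    return sentences


# Toy sample built for illustration (English, not Kwéyòl).
SAMPLE = "\n".join([
    "# text = Dogs bark.",
    "\t".join(["1", "Dogs", "dog", "NOUN", "_", "Number=Plur", "2", "nsubj", "_", "_"]),
    "\t".join(["2", "bark", "bark", "VERB", "_", "_", "0", "root", "_", "SpaceAfter=No"]),
    "\t".join(["3", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"]),
    "",
])

for token in parse_conllu(SAMPLE)[0]:
    print(token["id"], token["form"], token["upos"], token["head"], token["deprel"])
```

A Kwéyòl treebank would use the same ten columns, so a reader of this kind could be reused once annotated sentences exist.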
Ultimately, improving the status of an endangered language through careful application of technology would also exemplify the core principles of biocultural diversity. Overall, this investigation has implications across various domains including data science, computational linguistics, digital humanities, and biocultural diversity. Bearing this in mind, it should open up publication opportunities for the various aspects of this dissertation.
Please note:
- I do know I need to create a visual to better summarize the core concepts of the framework
- I do know I need to clean up my overall work to fit into the following format:
    - Statement of Research Questions or Hypothesis
    - Literature Review
    - Theoretical or Conceptual Model
    - Data and Analytic Method
- I need to add the references mentioned above
Brief literature review
Setting
Due to the extended period of social disruption caused by the 2020 SARS-CoV-2 pandemic, many gained a newfound appreciation for Maslow’s theory on a hierarchy of needs (Maslow, 1943; Matias et al., 2020; Wong et al., 2020; Casale & Flett, 2020). Global citizens reflected on the loss of lives, the loss of a sense of justice, the loss of financial stability, the subsequent loss of housing stability, and an overall sense of the loss of social normalcy; concerns grew for the increasing severity and frequency of natural disasters impacting all forms of life within our complex global socio-ecological environment.
In a time of crisis, Maslow’s theory can aid in visualizing and prioritizing scarce resources according to one’s needs, and data science can be used to explore, assess, and address those needs; data science can serve as a tool for establishing a degree of certainty in uncertain times. Often, obtaining our needs is intrinsically linked to the ease with which we can communicate: our ability to understand and to be understood. To effectively facilitate the communication of various societal needs, data science can be employed as a means of monitoring and maintaining the standards of numerous fields through the careful study of an environment’s linguistics.
The ease with which one is able to communicate with others can affect one’s sense of belonging. Language allows messages to be conveyed. The ability to understand and to be understood is crucial in times of great uncertainty and constant fear of possible emergencies.
Introduction to language study: creole
In 1863, August Schleicher, the German linguist, discussed languages as natural organisms that come into being, develop, age, and die according to laws that are quite independent of man’s will (cited by Arens 1969:259). This perception of language is contrary to the modern view that language is one’s set of communication habits, principally defined by social experience and led by an innate drive to decipher and learn the language practices of others. However, even when a few members of the “native speaking generation” remain, communities often lack structures dedicated to the retention of the language. Therefore, successive generations may require resources that express the necessity of the language as well as its desirability to ensure its continuance.
Mohan is of the view that living languages can die before “the last trace of memory has vanished”; that these languages may be experiencing a feigned guise of life through its observation of “sympathetic post-users from outside its system” (1979:42). To further quote Mohan, an obsolescent language:
is actually dead before its forms have totally disappeared, in two different senses. A small part of the non-native speaking generation has preserved dead tokens of the language. Also, the time gap between the age of the youngest native speaker and the latest possible age of language acquisition, in infancy, shows the language dead at its source, but with a now finite community of native speakers continuing, like the earlier light of a dead star, to travel its original course and give an illusion picture of vitality. (313)
- Bad format and sourcing for this paragraph. Need to clean this up and add the references mentioned in the paragraphs above. Some chunks were copied and pasted so I would not forget to include them; how do you quote someone who quoted an original writer, when you have no access to the original document??? Plus I am unsure if this is a proper academic website that I can quote from
Moreover, Sasse (2001) lists ten changes occurring during language obsolescence; several of which bear similarities to those described by Campbell & Muntzel (1989) and Palosaari & Campbell (2011), namely: loss of phonological distinctions, regularization of morphophonemics, loss of function words, analyticity, loss of morphology, loss of syntactic complexity, agrammatism, phonological and grammatical variability, reduction of vocabulary, and an increase in polysemy.
Evaluating and creating structures is crucial for the preservation and promotion of endangered languages.
- Bad format and sourcing for the above paragraph. Need to clean this up and add the references mentioned above.
- Perhaps could include and expand on the definition and application of systems dynamic model and other models here…
Perceptions of Saint Lucian Kwéyòl
The Caribbean has had a complicated past of trade, colonization, slavery, indentured labor, and, more recently, voluntary immigration. This has led to an actively multilingual environment that is ever-changing due to evolving political and legal policies, such as Citizenship by Investment (CIP) (Bayat & El Hachem, 2020; GIS, 2017; CIP, 2020; Capital, 2020; Harvey, 2020; Visa, 2016).
English is the main language spoken (a prestige language) in Saint Lucia; however, Saint Lucian Kwéyòl (Antillean Creole/Patois/Patwa) is the heritage language of the island. Several other languages are officially taught (French and Spanish) or generally spoken in the close-knit Caribbean (such as Dutch, Portuguese, Hindi, Arabic, and even Japanese, Mandarin, and increasingly Russian) (Hillman & D’Agostino, 2009; “Saint Lucia,” n.d.; Kobayashi, 2020; Nesheim, 2020; CBF, 2020). Frank’s 2008 work on “Sources of St. Lucian Creole Vocabulary” suggested that about 83.8% to 95.0% of the words he came across had French origins, about 2.8% to 3.2% were English, 0.6% to 0.7% were Amerindian-based, and about 0.5% to 0.6% were African-based; 0.4% were Indian language-based (Tamil or Hindi), and 0.1% were Spanish/Portuguese-based. A notable 11.8% of lexical items had etymological sources that could not be precisely determined (Frank, 2007); addressing this etymological mystery might be an apt challenge for natural language processing and machine translation.
According to Douglas Midgett’s work on the anthropological linguistics of Saint Lucia, the island’s inhabitants have struggled in their appreciation of its heritage language (Midgett, 1970). Historically, there were negative connotations attached to Saint Lucian Kwéyòl (Antillean Creole/Patois), and upon emancipation there was intense enmity between English and Creole in the newly free community; use of the latter was negatively equated “with all that is backward, rural, Negro, and unsophisticated” (Midgett, 1970). Midgett highlighted that the formal teaching of Patois would be viewed as “unjustifiable and in any case, would never be tolerated by even its most ardent user”.
Ultimately, Midgett did, however, acknowledge that “English is the language of all national institutions… Patois is the language of folk institutions”; therefore, it is not surprising that some are still encouraged to learn and converse in English, rather than in Saint Lucian Kwéyòl, to appear more professional or polished (OAS, 2018). At the time, he noted that most people agreed that increased proficiency in spoken and written English (or French) would be an educational must; however, writers and other academics held different views from actual educators. The former group believed that the use of creole in schools could aid with recognizing English as a second language, whereas educators adamantly argued against any use of Creoles in the schools.
He believed that the use of Patois in the schools interchangeably with less formal, more colloquial English would aid in establishing English in the minds of students as a functional Patois equivalent. He suggested that as long as educational institutions reinforce the conventional opinion of separating the two languages, the campaign for English literacy and spoken usage will not have widespread effectiveness. Midgett, however, underestimated how pervasive the English language would be; in fact, it is the creole language that currently lacks proper literacy among the public (Midgett, 1970).
- Need to clean this up and add the references mentioned above; might need to include a clear explanation of what is creole and how it differs from a pidgin.
Nonetheless, as of May 1970, Midgett suggested that the vitality of the language was still strong; he stated that “there still exists a situation in which virtually every native-born St. Lucian can speak Patois” (Midgett, 1970). This language still appears to be surviving today, albeit under an inconclusive status; despite the recent surge in appeal due to governmental and pop culture support, the lingering lack of definitive vitality data could inadvertently permit an unabated decline (Marmion et al., 2014; Midgett, 1970; Kabir, 2020).
In 1998, Frank explored and even expanded the written form of the creole language in Saint Lucia while attempting to translate an English Bible into the local language. Upon concluding his tasks, he remarked that the translation would indirectly boost creole literacy by giving people something they were motivated to read:
‘… for all practical purposes Creole remains an unwritten language for the majority of the population, which remains unaware of the books published in Creole. Attempts to teach Creole literacy have not met with much success because of lack of interest. Motivation is the most important factor in the success of any literacy program, and having something people want to read is the most important motivating factor’ (Frank & Frank, 1998).
- Need to clean this up. Perhaps spend some more time rewording this to use less of the direct quotes, and shorten this paper.
Issues with natural annotation (Challenges and Solutions for Annotating Saint Lucian Kwéyòl)
As noted before, Saint Lucian Kwéyòl is highly influenced by ‘imported vocabulary’, the majority of which is French, followed by English; however, it should be noted that the third largest portion of the vocabulary has no definitive etymological source. These words may be out-of-vocabulary (OOV) with respect to a standard English or French treebank and could make it difficult to use English-trained tools on Saint Lucian Kwéyòl (Wang et al., 2020). Moreover, in terms of topic prominence, this language is regarded as a ‘Subject-Predicate-Object’ ordered language (Frank, 1992).
- I do want to include a bit of discussion about the creole sentence structure, clauses, and the lingering impact of languages such as French, English, and Spanish. This may involve discussing the common Subject+Verb+Object (SVO) structure versus Subject+Object+Verb (SOV), etc. This sentence structure typically places a noun phrase before a verb phrase (or a Subject+Predicate(+Object) combination at the clause level); however, the structures of creole sentences may not be so simple.
- I also might want to mention the possible issues with transliteration when dealing with the etymology of words from languages written in non-Latin scripts that still contribute somewhat to the creole vocabulary; this would mostly concern Tamil and Hindi (Frank, 2007). Also, academics have suggested that despite their obvious connection, there is not always a single, direct relationship between Kwéyòl and French words (missing reference). Frank highlighted that there are many cases where a Kwéyòl noun originates from a French preposition and/or article plus noun; for example, Kwéyòl’s “lavi” is related to French’s ‘la vie’, “nanj” to ‘un ange’, “zòdi” to ‘les ordures’ and “dlo” to ‘de l’eau’ (missing reference).
It is also important to note that additional natural language processing issues may arise when manipulating the creole language or attempting to adapt it to a digital environment such as a comprehensive online dictionary. Public contributors may be able to speak the language; however, they may not be the best teachers. That is to say, contributors may not always be clear in their explanations or contributions. Overall, straying from official spellings of words can contribute to data entry issues. For example, one can observe the accents used with “chofé” (“to heat up”) and “chofè” (a “driver”) (missing reference). Without context and accents, these words would be very difficult to tell apart. Persons inaccurately applying accents, or unable to access the diacritics required by the formal orthography, can present problems in documenting the language.
As noted by Frank, some dictionary entries can present as variants of keywords, and debates can arise concerning which word should be the dominant standard form, and which should be the variant form (missing reference). For example, the Saint Lucian Kwéyòl dictionary notes jòdi as the standard form of the adverb ‘today’, and hòdi as the variant form (missing reference). Frank is said to address the issue of determining the Kwéyòl form of words by noting the ‘commonly-used form that is closest to the French origin (or, in some cases, origin from another source) was chosen for a full entry, and other forms less directly related to the form of the etymological source were said to be the variants’ (missing reference).
Problems of polysemy in Saint Lucian Kwéyòl
Another issue with dictionary compilation in this region may deal with the overlap of meanings attributed to certain words. Polysemy is a common occurrence in the low-resource language of Saint Lucian Kwéyòl (Mayeux, 2019; Cope & Schafer, 2017). One indication that an English verb is polysemous is that it corresponds to different verbs when translated into other languages. For example, one can review the English word ‘ask’ (for information) and ‘ask’ (for action). Both can be rendered by the single word “vragen” in Dutch; however, the majority of other languages use a different word for each English interpretation, like “fragen” and “bitten” in German, “preguntar” and “pedir” in Spanish, and “fråga” and “bedja” in Swedish, respectively (Kreidler, 1998; Padó & Lapata, 2005).
In Saint Lucian Kwéyòl, this can be seen where the term “mwen” can signify the pronouns ‘I, me, my, or mine’; the term “asou” is a preposition that can mean ‘on, on top of, atop, upon’, ‘off, off of, from’, ‘toward’, or ‘about, concerning’ (missing reference). The term “vè” can mean ‘glass’, ‘green’, or ‘worm’; the creole word “kay” can indicate future tense as well as ‘house, building’, ‘scale’ (appearing on the skin), and ‘reef’. “Lè” can mean ‘room’, ‘space’, and ‘time/hour’ (including discussions of ‘when’ and ‘if’). “Tan” can represent the nouns ‘time’ and ‘weather’, and an adjective indicating a ‘vague amount of something’. As a noun, “kwi” could represent the act of ‘crying/screaming/shouting’, as well as refer to a ‘calabash bowl or plate’; as an adjective, it could indicate the state of being ‘raw’ (missing reference).
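One way to make these overlaps usable in text analysis is to store them as a small sense inventory and flag tokens that need disambiguation. The sketch below is illustrative only: the glosses are copied from the paragraph above, while the names `SENSES` and `flag_polysemous` and the token list in the usage lines are hypothetical.

```python
import unicodedata


def norm(word: str) -> str:
    """Lowercase and NFC-normalize so accented keys compare reliably."""
    return unicodedata.normalize("NFC", word.lower())


# Sense inventory built from the glosses listed in the text.
SENSES = {norm(word): glosses for word, glosses in {
    "vè": ["glass", "green", "worm"],
    "kay": ["future tense marker", "house, building",
            "scale (on the skin)", "reef"],
    "lè": ["room", "space", "time, hour"],
    "tan": ["time", "weather", "vague amount of something"],
    "kwi": ["crying, screaming, shouting", "calabash bowl or plate", "raw"],
}.items()}


def flag_polysemous(tokens):
    """Return {token: glosses} for tokens with more than one listed sense."""
    flagged = {}
    for token in tokens:
        glosses = SENSES.get(norm(token), [])
        if len(glosses) > 1:
            flagged[norm(token)] = glosses
    return flagged


# Hypothetical token list (not a real sentence), for illustration.
for word, glosses in flag_polysemous(["Kay", "vè", "zabòka"]).items():
    print(word, "->", len(glosses), "senses:", "; ".join(glosses))
```

Flagged tokens could then be routed to a disambiguation step, for instance one informed by crowdsourced dictionary entries, while unflagged tokens pass through unchanged.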
It is important to note that there may need to be considerations for word sense disambiguation issues that arise from the intermingling of an almost identical creole from the neighboring island of Dominica. The creoles of both islands sound similar and some words are indeed the same, but the written form appears to vary slightly. This might be due to the liberties taken by the different authors of their dictionaries; Frank highlighted his penchant for leaning on French when considering the spelling of words (missing reference). Overall, it appears that while both countries agree that the Creole writing system is phonemically based (which can make it easier to learn than English to a certain extent) (missing reference), there are slight differences in spelling and in diacritic use and placement between countries; this issue can then permeate and linger in both creole-speaking countries.
It is said that the shared creole alphabet writing system arose out of two creole ethnography workshops held in St. Lucia in January 1981 and September 1982; this was developed through the efforts of researchers at “the University of the West Indies (U.W.I.), The Université Antilles – Guyane groups from St. Lucia (MOKWÉYÓL), Dominica (K.E.K.) and the Groupé d’Etude et de Recherche en Espace, Creolophone (GEREC) from Martinique and Guadeloupe” (of Dominica, 2020). Dominicans write ‘goodnight’ as ‘bon swé’, whereas Saint Lucians write ‘bonswè’ (missing reference). Additionally, consider the days of the week: the words for Sunday, Monday, Tuesday, and Saturday are the same, yet those for Wednesday, Thursday, and Friday differ. Dominicans write Wednesday as Mèkwédi whereas Saint Lucians write it as Mékwédi, and Dominicans write Thursday as Jèdi whereas Saint Lucians write it as Jédi; the accent placement is different (missing reference). Dominicans write Friday as Vanwédi whereas Saint Lucians write it as Vandwédi; here, while the accent is the same, the Saint Lucians appear to include an additional ‘d’, reminiscent of the original French [< Fr. vendredi] (according to Frank (missing reference)).
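The kinds of difference described above (accent only, spacing, or an extra letter) can be separated automatically, which may help when deciding whether two spellings should be merged. The following is a rough sketch under my own assumptions: the category labels and the function names are hypothetical, and the word pairs are the ones quoted in this section.

```python
import unicodedata


def fold(text: str) -> str:
    """Lowercase and drop combining accent marks."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def classify_difference(a: str, b: str) -> str:
    """Label how two spellings of the same word differ."""
    if unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b):
        return "identical"
    if fold(a) == fold(b):
        return "diacritics only"
    if fold(a).replace(" ", "") == fold(b).replace(" ", ""):
        return "spacing (and possibly diacritics)"
    return "letters differ"


# (Dominican spelling, Saint Lucian spelling) pairs quoted in the text.
PAIRS = [
    ("bon swé", "bonswè"),
    ("Mèkwédi", "Mékwédi"),
    ("Jèdi", "Jédi"),
    ("Vanwédi", "Vandwédi"),
]

for dominican, saint_lucian in PAIRS:
    print(dominican, "/", saint_lucian, "->",
          classify_difference(dominican, saint_lucian))
```

Pairs labelled as differing only in diacritics or spacing are plausible candidates for automatic merging, whereas pairs whose letters differ, like the Friday example, probably need a dictionary entry linking them.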
Even the word “Creole” can be viewed as a contested, polysemous term in the English language (Cope & Schafer, 2017). The term has been employed at varied periods and in several regions to distinguish a wide range of entities; this includes ‘identities, languages, peoples, ethnicities, racial heritages, and cultural artifacts’ (Cope & Schafer, 2017). As an adjective, Creole was applied as an indicator of higher status bestowed upon Louisiana-born slaves to distinguish them from those born in Africa (Cope & Schafer, 2017; Brasseaux, 2005). It was also used as a noun to designate local birth in Louisiana, regardless of racial heritage; later, Americans used creole when referring to people of Spanish or French descent, yet it has often been conflated with the term “Cajun” (which described French colonists who settled in Canada’s Acadia region, then migrated to Louisiana). In fact, for some time, there was also a misconception that the term only referred to whites born in Louisiana (Brasseaux, 2005).
Currently, Creole primarily refers to one’s linguistic heritage as the main source of their ethnic identity (‘often French culture and a unique Franco-linguistic dialect’) (Cope & Schafer, 2017). This is particularly true of those of mixed ancestry or ancestry foreign to the location (“Creole peoples,” 2020; Cope & Schafer, 2017). In the Caribbean, the terms ‘Creole’, ‘Kreyol’, ‘Kweyol’, or ‘Kwéyòl’ can also indicate the regional creole languages such as Antillean Creole (Dominican and Saint Lucian Kwéyòl), Haitian Creole, and Jamaican Creole (“Creole peoples,” 2020; “Antillean Creole,” 2020). While most are recognized as French-based, there is an increasing academic argument to officially recognize unique variants of English-based creoles (instead of simply viewing them as the poor application of English) (Irvine, 2020; Irvine, 2020).
Polysemy may indeed present an issue when attempting to study a language or its dialects; however, these complications are in fact often viewed positively in the cultures where creole is spoken, and they are often the premise and appeal of much literature in these creole languages. Calypso, and most other endemic forms of music, may celebrate this ability to utilize words or phrases bearing double meanings to indirectly discuss topics that are often crude (Stephens, 2013). Phillips suggested that Calypsos can engage this method via ‘lamina lyrics’. Much like an onion, these Calypsos have a number of different levels of meaning, concealed one underneath the other. To achieve this, Calypsonians use frames and masks that manifest in Calypsos as metaphor, metonymy, polysemy, irony, and satire (Phillips, 2006).
Consideration of digital tools for language analysis, rejuvenation, and conservation
- Need to discuss treebanks, universal dependencies, dependency parsing performance, POS tagging accuracy, etc.
- Need to discuss sentiment analysis, etc.
The current LIWC dictionary is composed of 5,690 words and word stems. This means that some words are not accounted for and, more broadly, that translation does not always convey sentiment; sarcasm, irony, slang, polysemy, and capitonyms can all be problematic if the context is not adequately communicated.
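To illustrate why a fixed word-and-stem list leaves gaps, the following is a minimal sketch of dictionary-based category lookup. The lexicon is a hypothetical toy list written for this example (it is not taken from LIWC), the trailing `*` stem convention is used here only as an illustrative matching rule, and the names `lookup` and `coverage_report` are assumptions introduced for this sketch.

```python
import re

# Hypothetical toy lexicon for illustration only; these entries are NOT
# taken from LIWC. A trailing "*" marks a stem matching any continuation.
TOY_LEXICON = {
    "happy": "positive",
    "love*": "positive",
    "sad*": "negative",
    "hurt*": "negative",
}


def lookup(token, lexicon):
    """Return the category of a token, or None if the lexicon has no match."""
    token = token.lower()
    if token in lexicon:
        return lexicon[token]
    for entry, category in lexicon.items():
        if entry.endswith("*") and token.startswith(entry[:-1]):
            return category
    return None


def coverage_report(text, lexicon):
    """Return (labels per token, unmatched tokens, share of tokens matched)."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())  # runs of letters only
    labels = {t: lookup(t, lexicon) for t in tokens}
    unmatched = [t for t in tokens if labels[t] is None]
    ratio = (len(tokens) - len(unmatched)) / len(tokens) if tokens else 0.0
    return labels, unmatched, ratio


if __name__ == "__main__":
    # A sarcastic sentence: the only matched word is labeled "positive".
    print(coverage_report("Oh great, I loved waiting in the rain", TOY_LEXICON))
```

In the sarcastic example, most tokens fall outside the lexicon and the single match carries the opposite of the intended sentiment, which is the context problem described above.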
- Need to discuss online dictionaries as these serve as important resources to low-resource languages (might need to add something more to the details below after noting the pros and cons of their existence).
Online dictionaries allow users and compilers opportunities far beyond those of traditional publishing constructions. For an endangered language, the use of such a resource is becoming increasingly essential. However, static online dictionaries may not be sufficient to meet the demands of a language that is slowly dying. Static dictionaries with dynamic qualities may better serve the needs of cataloging an endangered language.
- (this may be an area that I want to investigate/write a paper on, and possibly create a future tool to address this issue).
Online dictionaries benefit from using technology to update their records regularly. This not only pertains to new sections of the alphabet but “also with words of particular interest to its users at the time”. ‘LOL’ and ‘YOLO’ were online slang terms that were added to the Oxford English Dictionary (OED) in 2011 and 2016, respectively. The third edition of the OED appears to no longer suppress widely‑used slang terms; however, it certainly is incapable of documenting the meaning and use of every fleeting term used among small assortments of speakers within the entirety of the English‑speaking world.
Urban Dictionary is a free online dictionary of contemporary English slang usage; it is a “collaborative project of over 1 million definitions for over 400,000 unique headwords”. It serves as an online lexicography where public contributors explain the meanings of words and phrases not readily covered by traditional dictionaries; contributors collaborate, cooperate, and compete for meaning-making. Eventually, depending on the proliferation of use in society, official dictionaries, such as the OED, may elect to officially adopt a word from such a collection.
The collaborative compilation of online lexicography, therefore, is known to linguistics. In fact, there is a webpage, similar to Urban Dictionary, that strategically caters to Caribbean consumers. While the region bears various peoples with differing complex histories, they do share similarities; while most of these similarities are the unfortunate result of colonization by similar superpowers, certain words are ubiquitous due to the ease of traversing and settling in neighboring countries. Persons of these areas experience similar flora and fauna, and environmental and social phenomena that are endemic to their setting and way of life. The website, Wiwords, is a dynamic online dictionary that highlights instances of shared vocabulary, while also allowing for contributions that are unique to certain countries in the Caribbean.
It should, however, be noted that the number of items tagged as Saint Lucian Kwéyòl appears to be a bit low in this particular online linguistic community. For example, there are no contributions to the Wiwords section dedicated to local quotes or sayings, even though these items are quite abundant in Saint Lucian Kwéyòl. Most Saint Lucian-tagged items appear to pertain to discussions of food, and animal and plant life. Therefore, to better preserve the creole language within Saint Lucia, this dynamic online community dictionary would be a very useful means of gauging the language’s vitality.
- Need to clean this up and add the references mentioned above.
Investigation
Some works have suggested that the lexical deviations inherent to creoles can be accommodated by applying the dependency relation definitions of an existing language (like French or English) while ensuring consistency with the annotations in other non-French or non-English Universal Dependency (UD) treebanks.
It does appear that it would be helpful to develop a framework to improve the models for linguistics; perhaps an extendable framework that can demonstrate its application to low-resource languages such as creole. This raises the question: how can one leverage existing treebanks, focusing on POS tagging and sentiment analysis, for languages that do not have an established digital presence (corpus)?
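As a starting point for working with existing treebanks, UD data is distributed in the CoNLL-U layout (ten tab-separated columns per token, with `#` comment lines). The following is a minimal sketch of reading that layout and tallying POS tags; the French sentence and its annotation are a toy example written for this sketch rather than a line from any published treebank, and `read_tokens` is a hypothetical helper name.

```python
from collections import Counter

# Toy French sentence in CoNLL-U column order:
# ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
# Illustrative assumption only, not copied from a published treebank.
ROWS = [
    ["1", "Il", "il", "PRON", "_", "_", "2", "nsubj", "_", "_"],
    ["2", "dort", "dormir", "VERB", "_", "_", "0", "root", "_", "_"],
    ["3", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"],
]
SAMPLE = "# text = Il dort.\n" + "\n".join("\t".join(r) for r in ROWS) + "\n"


def read_tokens(conllu_text):
    """Parse CoNLL-U text into a list of token dictionaries."""
    tokens = []
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank separators and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword-token ranges and empty nodes
        tokens.append({
            "id": int(cols[0]),
            "form": cols[1],
            "upos": cols[3],
            "head": int(cols[6]),
            "deprel": cols[7],
        })
    return tokens


if __name__ == "__main__":
    toks = read_tokens(SAMPLE)
    print(Counter(t["upos"] for t in toks))
    print([t["form"] for t in toks if t["deprel"] == "root"])
```

Tag and relation counts of this kind from a French or English treebank could serve as a baseline against which any future Kwéyòl annotation is compared.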
Additionally, this ultimately leaves one to wonder whether it is logical and possible to merge an official online static dictionary with a dynamic slang dictionary to increase overall fluency in an endangered language. Such a system would require access to an existing dictionary, as well as opportunities for dictionary contributions. The creation of an interactive dictionary may help reinvigorate the language through active use of and contributions to this resource, while also offering opportunities to actively discuss and clarify words and concepts associated with creole.
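One way to picture such a merged resource is a single index in which every entry records whether it came from the official dictionary or from a community contributor. The following is a minimal sketch under that assumption; the class and field names (`Entry`, `MergedDictionary`, `source`, `region`) are hypothetical and do not describe any existing system, and the two sample entries reuse the ‘goodnight’ forms quoted earlier in this document.

```python
from collections import defaultdict
from dataclasses import dataclass

VALID_SOURCES = ("official", "community")


@dataclass(frozen=True)
class Entry:
    headword: str
    gloss: str
    source: str                  # "official" or "community"
    region: str = "Saint Lucia"  # where the form is reported to be used


class MergedDictionary:
    """Hypothetical store combining static and contributed entries."""

    def __init__(self):
        self._entries = defaultdict(list)

    def add(self, entry: Entry) -> None:
        if entry.source not in VALID_SOURCES:
            raise ValueError(f"unknown source: {entry.source!r}")
        self._entries[entry.headword.lower()].append(entry)

    def lookup(self, headword: str) -> list:
        """Return matching entries, official ones first."""
        found = self._entries.get(headword.lower(), [])
        # sorted() is stable, so contribution order is kept within a source.
        return sorted(found, key=lambda e: e.source != "official")


if __name__ == "__main__":
    d = MergedDictionary()
    d.add(Entry("bonswè", "goodnight (community note)", "community"))
    d.add(Entry("bonswè", "goodnight", "official"))
    d.add(Entry("bon swé", "goodnight", "community", region="Dominica"))
    for e in d.lookup("Bonswè"):
        print(e.source, "-", e.gloss)
```

Keeping the source on each entry lets the static record stay authoritative while contributed senses remain visible for discussion and later review.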
- Need to clean this up and add the references mentioned above.
Implications for the Field of Study: The work’s significance and contribution to the field of Data Science and cross-domain appeal.
As noted above, this paper intends to focus on computational linguistic (and machine translation) skills to address societal communication difficulties, particularly when dealing with low-resource languages. Introducing a framework to natural language processing will improve computational linguistics as well as data science, as both tend to use the same or similar models and technologies. However, there are additional related fields that may benefit from introducing frameworks that ultimately improve communications, particularly in low-resource/endangered languages. Therefore, this work can also be considered an exploration of a possible cross-domain approach to ethical data science projects, highlighting the similarities of skills and technologies associated with, and the possible merger of, the topics related to digital humanities and biocultural diversity. Overall, Saint Lucian Kwéyòl/Antillean creole can benefit from utilizing online language resources.
During the 2020 SARS-CoV-2 pandemic, global citizens began critically assessing existing communication issues and problematic policies, and strove to create and cement beneficial policies for the future. It appeared that genuine value could be generated from studies in the areas of Digital Humanities (DH) and Biocultural Diversity (BCD), as these fields specifically focus on the intrinsic connection between societal structures and their possible harmony with their surroundings.
Digital Humanities can be interpreted as the intersection of computing (or digital technologies) and humanities disciplines; it is regarded as the technology space where many digital humanists undertake the process of “translating texts into digital spaces and data or translating digital and quantitative information into new texts and interpretations” (Levenberg et al., 2018). The field of digital humanities is quite diverse and flexible, meaning that the skills acquired can be applied to various areas. For example, an understanding and appreciation of linguistics is often helpful when working in this field; a digital humanities task related to linguistic analysis and marketing could entail an analysis of YouTube Play comments to aid in answering a question about targeted audiences (Levenberg et al., 2018).
The ability to reliably collect and manipulate a large set of texts is crucial; one may employ computational techniques from “natural language processing, corpus linguistics, distant reading, or broad reading” (Levenberg et al., 2018). These techniques assist with data processing; this can encompass data aggregation and analysis of large sets of structured or unstructured texts (Levenberg et al., 2018). Depending on the purposes of their research, it is said that different disciplinary approaches use distinct sets of algorithmic or programmatic approaches to interpreting the contents of texts (Levenberg et al., 2018).
Digital humanities can be the strategic and systematic implementation and utilization of digital resources in the study of the humanities and further analysis of their application. For example, this can take the form of works in archeology. Undertakings can produce an intricate reconstruction of the social and environmental history of various areas of a bygone era (like developing “Patterns of Etruscan Urbanism” (Stoddart et al., 2020)); this area is particularly useful when attempting to process historical documents related to time data (Ortman, 2019). Additionally, these skills can be useful in recreating documents (Ortman, 2019), cataloging images (Ziegler Delgado, 2020), applying computer vision (Cornia et al., 2020), and using various forms of machine translation to handle text processing (Ye & Boot, 2020).
- Need to expand this area a bit; may need more sources.
In Mattmann’s “vision for data science” he suggested that “for the specialism to emerge and grow, data scientists will have to overcome barriers that are common to multidisciplinary research” (Mattmann, 2013). Luisa Maffi was the co-founder and director of Terralingua and a pioneer in the Biocultural Diversity domain of research (Terralingua, 2019; Terralingua, 2020; Terralingua, 2014; Maffi & others, 2007; Maffi, 2005). She revealed the interrelated (and possibly coevolved) nature of the diversity of all forms of life within a complex socio-ecological adaptive system. BCD projects tend to have some connection to conservation efforts, and have strong ties to linguistics; according to Maffi, biological, cultural, and scientific complexity add to the various forms of diversity in life (Terralingua, 2019; Bates et al., 2020; Buckley, 2020; Corlett et al., 2020; Maffi & others, 2007; Maffi, 2005). BCD has been regarded as an evolving perspective for examining the interconnection of living beings and their environments, particularly regarding conservation efforts (Myers et al., 2000); one source regards it as highlighting the “interrelatedness between people and their natural environment” (Buizer et al., 2016). Maffi has dedicated much of her work to linguistics (Maffi, 2005), particularly the preservation and reinvigoration of endangered languages (Maffi, 2002; Maffi, 2003).
While BCD is adaptable to every environment, it is very apt for positive projects in rural or countryside communities (with limited infrastructure) and even indigenous communities (Terralingua, 2019; Terralingua, 2020; Terralingua, 2014; Maffi & others, 2007; Maffi, 2005). Often these communities highly value their environment as it sustains them. For example, Brazil is a country with vast natural resources, and it is home to various indigenous communities (with many unique languages).
In one Brazilian study, the thoughtful application of BCD actions aided in counteracting adverse social outcomes where government interference in local nature governance occurred (Mooij et al., 2019). This work paired BCD with a Practice-Based Approach (PBA) to reduce miscommunication or inaction in community–government interactions; this was an effort to ensure that the essence of typical, or daily, issues between local communities and ruling governments was not lost when those officials communicated the community’s financial aid concerns to domestic and international audiences. To create a more complete picture of human-nature interaction, and a better understanding of the efficacy and acceptance of conservation interventions, locals were able to express their basic practices relating to the local environment (such as hunting preferences), and they were able to receive advice and resources for sustainable practices and livelihoods. The PBA facilitated subjects mixing “external plans with their own logic and respond to outsiders’ plans according to their own logic”, thus allowing for increased communication and contemplation of community issues and needs. Prudent reflection on BCD was said to have aided in detecting (non-)compliant behavior “that would have otherwise likely gone unnoticed” (Mooij et al., 2019). Therefore, BCD has broad applications in community–government interaction, and can aid in the careful development of communities that require extensive development resources.
Based on the successes noted in Brazil, BCD projects may facilitate helpful developments in other countries, including small island nations (Simbiak et al., 2019). Saint Lucia, an island country in the Caribbean, has a local heritage language, Saint Lucian Kwéyòl, that is currently in a precarious state; no language vitality census has been attempted since the late 1940s (Irvine, 2020; Hilaire, 2008; St-Hilaire, 2011). The vocabulary of this language is intrinsically tied to objects in the natural environment. Moreover, the country’s national cultural policy even outlined a desire for Saint Lucian Caribbean people to “be aware of the importance of living in harmony with the environment” (UNESCO, 2017; CDF, 2019).
Another work acknowledged Maffi’s expression of concern for the twofold destruction of local cultures and wilderness in her early writings, yet the work highlighted the possible positive outcomes with diligent proactive efforts. It was believed that acknowledgment of BCD would facilitate a move from a “crisis narrative to a dynamic narrative” (Elands et al., 2019). Additionally, the work highlighted that BCD studies do not solely occur in rural areas; observations could be made in cities to assist with improving living standards and encouraging urban green infrastructure projects (Elands et al., 2019).
Cities are increasingly considering BCD projects as a means of rapidly ameliorating living conditions. BCD has been recognized for its ability to foster creative solutions to addressing environmental concerns, and aid in a more forceful embedding of ecology into decision-making (beyond standard ecological concepts such as Ecosystems Services (ES)) (Buizer et al., 2016); one work insisted that researchers and policy-makers carefully consider including BCD in their “conceptual repertoire” related to environmental projects as a means of addressing the “value-orientations of all stakeholders” (Buizer et al., 2016).
While the concept of biocultural diversity is relatively novel, a growing body of work is being developed, even during the 2020 SARS-CoV-2 pandemic. In fact, the pandemic appeared to be creating a unique opportunity to academically assess the interrelatedness of entities existing and enduring through the crisis; as noted earlier, there are unusual increases and decreases in interactions between humans and animals, and between humans and their environments (Reyes-Valdés & Kantartzi, 2020; Bates et al., 2020; Buckley, 2020; Corlett et al., 2020).
For example, one new work during this time acknowledged the concept’s typical associations with transdisciplinary projects and the concept’s involvement with linguistic, cultural, and biological research methods in tandem with numerous statistical and mathematical approaches (Reyes-Valdés & Kantartzi, 2020). However, the work then demonstrated the adaptability of the concept to biological data, by connecting transcriptome analysis bearing a biological interaction structure of a cultural group. Unique relationships could be identified through the lens of the cultural group component. The findings were supported mathematically, as it was said that “the specificity and specialization indices have a direct mathematical relationship with the biocultural complexity, which can be interpreted as the effective number of biocultural units equivalent to the observed data” (Reyes-Valdés & Kantartzi, 2020).
- May need to expand this area a bit; may need more sources
Motivation is essential to change; however, to properly allocate resources, data collection and analysis should occur. Based on the details above, it does appear that the areas of digital humanities and biocultural diversity acknowledge and respect careful scientific observation and accounting of findings. Data science offers a plethora of tools with which to explore our world. However, it is often a difficult task to decipher what tools to use and when. Developing the skill and capacity to achieve this requires quite a bit of study. It also requires a significant degree of concern and compassion to dedicate resources to research that may not have a direct, immediate, or substantial financial reward.
Publication Plan
The publication process can be viewed as an essential part of one’s Ph.D. solidification in the field. While publishing before graduating from a program can at times be difficult, attempting it may be advantageous to one’s ultimate success in the field. A publication plan allows one to market their results to venues that are more apt to appreciate one’s works; this allows one to ensure that their contributions are better recognized by industry professionals. Some have viewed such publications as providing a powerful and significant indicator for ‘scientific career advancement’ (Giebel, 2019); this is particularly important in a highly competitive field.
Given the depth of the literature review I intend to complete concerning the creation of a framework and its application, I intend to publish a few papers/chapters before my final dissertation presentation (as soon as possible). My main target is a recently created publication geared towards data science papers: Patterns. It is said to publish
‘…original research in data science, particularly focusing on solutions to the cross-disciplinary problems that all researchers face when dealing with data, and articles about datasets, software code, algorithms, infrastructures, etc., with permanent links to these research outputs’. Patterns also promotes cross-community conversation by publishing opinion pieces and review articles (“Patterns,” 2020). Since I do intend to discuss matters related to computational linguistics, frameworks, and my dissertation’s societal value to low-resource languages (biocultural diversity topics), I do believe that this is an appropriate target. However, this does not mean that other publishers will be ignored; I am also open to suggestions and changes concerning this matter. Please see below for details on publications currently being considered.
- Patterns:
https://www.cell.com/patterns/home
- Journal of Pidgin and Creole Languages:
https://www.jbe-platform.com/content/journals/15699870
- Applied Corpus Linguistics:
https://www.journals.elsevier.com/applied-corpus-linguistics
- Language & Communication: An Interdisciplinary Journal:
https://www.journals.elsevier.com/language-and-communication
- Lingua: An International Review of General Linguistics:
https://www.journals.elsevier.com/lingua
- Ampersand: An International Journal of General and Applied Linguistics:
https://www.journals.elsevier.com/ampersand
- Journal of Memory and Language:
https://www.journals.elsevier.com/journal-of-memory-and-language
- Linguistics and Education: An International Research Journal:
https://www.journals.elsevier.com/linguistics-and-education
- Journal of Second Language Writing: An international journal on second and foreign language writing and writing instruction:
https://www.journals.elsevier.com/journal-of-second-language-writing
- Language Sciences:
https://www.journals.elsevier.com/language-sciences
- Modern Journal of Language Teaching Methods (MJLTM):
https://mjltm.org/
- Elsevier Journals in Linguistics and Language:
https://www.elsevier.com/social-sciences-and-humanities/linguistics-and-language/journals
- Journal of Marine and Island Cultures:
https://jmic.online/
- Sustainability — Open Access Journal:
https://www.mdpi.com/journal/sustainability
References:
- Frank, D. B. (2007). Sources of St. Lucian Creole Vocabulary. http://www.saintluciancreole.dbfrank.net/workpapers/sources_of_vocabulary.pdf
- Crosbie, P., Frank, D., Leon, E., & Samuel, P. (2001). Kwéyòl dictionary. Castries, Government of Saint Lucia, Ministry of Education. http://www.saintluciancreole.dbfrank.net/dictionary/KweyolDictionary.pdf
- Soto, W. (2020). Language Identification of Guadeloupean Creole. Groupement De Recherche Linguistique Informatique Formelle Et De Terrain (LIFT), 53. https://hal.archives-ouvertes.fr/hal-03066031/document#page=59
- Joseph, J. C. (2020). Kwéyòl Sent Lisi. In Kwéyòl Sent Lisi. https://kweyolsentlisi.weebly.com/
- Joseph, J. C. (2020). Kwéyòl Sent Lisi. In Facebook. https://www.facebook.com/kweyolsentlisi/
- Nivre, J., De Marneffe, M.-C., Ginter, F., Goldberg, Y., Hajic, J., Manning, C. D., McDonald, R., Petrov, S., Pyysalo, S., Silveira, N., & others. (2016). Universal dependencies v1: A multilingual treebank collection. Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), 1659–1666.
- Wang, H., Yang, J., & Zhang, Y. (2020). From genesis to creole language: Transfer learning for singlish universal dependencies parsing and pos tagging. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), 19, 1–29. https://doi.org/10.1145/3321128
- Ferreira, J., Gonçalo Oliveira, H., & Rodrigues, R. (2019). Improving NLTK for processing Portuguese. 8th Symposium on Languages, Applications and Technologies (SLATE 2019). https://drops.dagstuhl.de/opus/volltexte/2019/10885/pdf/OASIcs-SLATE-2019-18.pdf
- Abeillé, A. (2012). Treebanks: Building and using parsed corpora (Vol. 20). Springer Science & Business Media.
- Abeillé, A., Clément, L., & Toussenel, F. (2003). Building a treebank for French. In Treebanks (pp. 165–187). Springer. https://link.springer.com/chapter/10.1007/978-94-010-0201-1_10
- Abeillé, A., & Barrier, N. (2004). Enriching a French treebank. LREC. http://www.lrec-conf.org/proceedings/lrec2004/pdf/562.pdf
- Maslow, A. H. (1943). Theory of Human Motivation: Psychological Review. In Classics in the History of Psychology: A Theory of Human Motivation. Christopher D. Green of York University, Toronto, Ontario. https://psychclassics.yorku.ca/Maslow/motivation.htm
- Matias, T., Dominski, F. H., & Marks, D. F. (2020). Human needs in COVID-19 isolation. SAGE Publications Sage UK: London, England. https://journals.sagepub.com/doi/full/10.1177/1359105320925149
- Wong, A. H., Pacella-LaBarbara, M. L., Ray, J. M., Ranney, M. L., & Chang, B. P. (2020). Healing the Healer: Protecting Emergency Health Care Workers’ Mental Health During COVID-19. Annals of Emergency Medicine. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7196406/
- Casale, S., & Flett, G. L. (2020). Interpersonally-based fears during the COVID-19 pandemic: Reflections on the fear of missing out and the fear of not mattering constructs. Clinical Neuropsychiatry, 17(2), 88–93. https://doi.org/10.36131/CN20200211
- Terralingua. (2019). What Is Biocultural Diversity? In Terralingua. Terralingua. https://terralingua.org/what-we-do/what-is-biocultural-diversity/
- Maffi, L., & others. (2007). Biocultural diversity and sustainability. The SAGE Handbook of Environment and Society. SAGE Publ., London, 267–277. https://doi.org/10.1002/9781118924396.wbiea1797
- Maffi, L. (2005). Linguistic, cultural, and biological diversity. Annu. Rev. Anthropol., 34, 599–617. https://doi.org/10.1146/annurev.anthro.34.081804.120437
- Terralingua. (2020). Biocultural Diversity Education. In Terralingua. Terralingua. https://terralingua.org/our-projects/biocultural-diversity-education/
- Terralingua. (2014). Our Overview of a New Approach to Education and Curriculum Development. In Terralingua. Terralingua. https://terralingua.org/wp-content/uploads/2015/07/BCDEI-Overview.pdf
- Bassolas, A., Lenormand, M., Tugores, A., Gonçalves, B., & Ramasco, J. J. (2016). Touristic site attractiveness seen through Twitter. EPJ Data Science, 5(1), 12.
- Mocanu, D., Baronchelli, A., Perra, N., Gonçalves, B., Zhang, Q., & Vespignani, A. (2013). The twitter of babel: Mapping world languages through microblogging platforms. PloS One, 8(4), e61981.
- Gonçalves, B., & Sánchez, D. (2014). Crowdsourcing dialect characterization through Twitter. PloS One, 9(11), e112074.
- Ronen, S., Gonçalves, B., Hu, K. Z., Vespignani, A., Pinker, S., & Hidalgo, C. A. (2014). Links that speak: The global language network and its association with global fame. Proceedings of the National Academy of Sciences, 111(52), E5616–E5622.
- Šćepanović, S., Mishkovski, I., Gonçalves, B., Nguyen, T. H., & Hui, P. (2017). Semantic homophily in online communication: evidence from twitter. Online Social Networks and Media, 2, 1–18.
- Joseph, J. C. (2020). Mèt Lékòl - Kwéyòl Sent Lisi. In 10 St. Lucia La Rose Songs. Kwéyòl Sent Lisi. https://kweyolsentlisi.weebly.com/megravet-leacutekogravel.html
- Yannucci, L. (2020). Fanm Ki Dou - Saint Lucia. In Mama Lisa’s World of Children and International Culture. mamalisa. https://www.mamalisa.com/?t=es&p=4060
- Weekes, T. (2014). Bodies, Memories and Spirits: A Discourse on Selected Cultural Forms and Practices Of St.Lucia. Xlibris US. https://books.google.com/books?id=8O_wAgAAQBAJ
- Frank, D. (1990). Six St Lucian French Creole Narrative Texts. In Work Papers of the Summer Institute of Linguistics in St Lucia. SIL International. https://www.dbfrank.net/saintluciancreole/workpapers/six_texts.pdf
- SIL. (1989). Sé Kon Sal Fèt - A reading book in Saint Lucian Creole. In Summer Institute of Linguistics. SIL International. https://tinyurl.com/yeajwe8b
- Tèstèman nèf-la: Épi an posyòn an liv samz-la. (1999). In Saint Lucian Creole French New Testament. The Digital Bible Society. http://downloads.dbs.org/scriptures/pdf/ACFWBT/ACFWBT.pdf
- Saint Lucian Creole French New Testament. (2004). In The Global Bible Project. Wycliffe Bible Translators. https://acf.global.bible/bible/4e558d011298deae-01/2JN.intro
- Glory (Saint Lucian Creole French). (2020). In The Translation Insights and Perspectives (TIPs). United Bible Societies. https://tips.translation.bible/story/glory-saint-lucian-creole-french/
- Kwéyòl National Anthem approved. (2016). In Saint Lucia - Access Government. Office of the Prime Minister. http://www.govt.lc/news/kw-y-l-national-anthem-approved
- Louisy, D. C. P. L. (2004). "Facing Opportunity and Taking Responsibility for Our Country’s Development". In Throne Speech. Governor General Dame Calliopa Pearlette Louisy. http://caribbeanelections.com/eDocs/articles/lc/lc_Throne_Speech_2004.pdf
- SLU. (2020). Folk Research Centre. In Facebook. https://www.facebook.com/saintluciafolk/
- Musical Traditions of St. Lucia, West Indies. (1993). In Dances and Songs from a Caribbean Island. Smithsonian/Folkways Recording. https://media.smithsonianfolkways.org/liner_notes/smithsonian_folkways/SFW40416.pdf
- Carnegie, J. R. (1996). Select Bibliography of the Literature of the English-speaking West Indies, 1989-1991. Journal of West Indian Literature, 7(1), 1–53. http://www.jstor.org/stable/23019891
- Wijffels, J. (2020). UDPipe Natural Language Processing - Text Annotation. In UDPipe. https://cran.r-project.org. https://cran.r-project.org/web/packages/udpipe/vignettes/udpipe-annotation.html
- Bayat, S. M., & El Hachem, H. (2020). Saint Lucia - The Corporate Immigration Review - Edition 10 - TLR. In The Law Reviews. The Law Reviews. https://thelawreviews.co.uk/edition/the-corporate-immigration-review-edition-10/1227321/st-lucia
- GIS. (2017). Chamber discusses CIP changes. In Saint Lucia - Access Government. GIS. http://www.govt.lc/news/chamber-discusses-cip-changes
- CIP, S. L. (2020). CIP FAQs. In Citizenship By Investment. cipsaintlucia.com. https://www.cipsaintlucia.com/faqs
- Capital, A. (2020). Saint Lucia Citizenship by Investment - Your 2nd Passport. In Arton Capital. Arton Capital. https://www.artoncapital.com/global-citizen-programs/saint-lucia/
- Harvey, L. G. (2020). Saint Lucia Citizenship By Investment Program (CIP): HLG. In Harvey Law Group - World’s Leading Law Firm in Business Law, Investment Immigration, Citizenship-By-Investment, Residency-By-Investment. Harvey Law Group. https://www.harveylawcorporation.com/saint-lucia/
- Visa, I. (2016). CIP Restructuring In Progress for Saint Lucia. In Qicms- Saint Lucia CIP Restructuring | Citizenship by Investment. invest-visa. https://www.invest-visa.com/Post/240231/Immigration-News/CIP-Restructuring-InProgress-for-Saint-Lucia
- Hillman, R. S., & D’Agostino, T. J. (2009). Understanding the contemporary Caribbean. Lynne Rienner Publishers BoulderLondon. https://www.rienner.com/uploads/4a48e971d622c.pdf
- Saint Lucia. In Countries and Their Cultures. https://www.everyculture.com/No-Sa/Saint-Lucia.html
- Kobayashi, T. (2020). Message from the Chief Representative. In JICA. https://www.jica.go.jp/stlucia/english/office/about/message.html
- Nesheim, C. H. (2020). The Saint Lucia Citizenship by Investment Programme. In Investment Migration Insider. Investment Migration Insider. https://www.imidaily.com/the-saint-lucia-citizenship-by-investment-programme/
- CBF. (2020). Saint Lucia Presents a New CIP-Project. In Cross Border Freedom. CBF. https://www.crossborderfreedom.com/saint-lucia-presents-a-new-cip-project/
- Midgett, D. (1970). Bilingualism and linguistic change in St. Lucia. Anthropological Linguistics, 158–170. https://www.jstor.org/stable/30029245
- OAS. (2018). Organization of American States: Democracy for peace, security, and development. In OAS. Organization of Eastern Caribbean States Commission represented by St. Lucia and the Innoved Uniq, Quisqueya University, Haiti. https://www.oas.org/cotep/LibraryDetails.aspx?lang=en
- Marmion, D., Obata, K., & Troy, J. (2014). Community, identity, wellbeing: the report of the Second National Indigenous Languages Survey. Australian Institute of Aboriginal and Torres Strait Islander Studies Canberra.
- Kabir, A. J. (2020). Creolization as balancing act in the transoceanic quadrille: Choreogenesis, incorporation, memory, market. Atlantic Studies, 17(1), 135–157.
- Frank, D., & Frank, D. (1998). Lexical challenges in the St. Lucian Creole Bible translation project. Twelfth Biennial Conference of the Society for Caribbean Linguistics, Castries, St. Lucia, 1–16. https://www.saintluciancreole.org/workpapers/lexical_challenges.pdf
- Frank, D. (1992). Clause versus Sentence in St. Lucian French Creole. https://www.saintluciancreole.org/workpapers/clause_versus_sentence.pdf
- Mayeux, O. (2019). Rethinking decreolization: language contact and change in Louisiana Creole. https://www.repository.cam.ac.uk/handle/1810/294526
- Cope, M. R., & Schafer, M. J. (2017). Creole: a contested, polysemous term. Ethnic and Racial Studies, 40(15), 2653–2671. https://doi.org/10.1080/01419870.2016.1267375
- Kreidler, C. W. (1998). Introducing English Semantics. Routledge. https://books.google.com/books?id=QFiv5tDnI_sC
- Padó, S., & Lapata, M. (2005). Cross-lingual bootstrapping of semantic lexicons: The case of framenet. AAAI, 1087–1092. https://www.aaai.org/Papers/AAAI/2005/AAAI05-172.pdf
- Government of Dominica. (2020). A Brief History of Kwéyòl (Patwa). In The Division of Culture. gov.dm. http://divisionofculture.gov.dm/creole-languages/5-kweyol
- Brasseaux, C. A. (2005). French, Cajun, Creole, Houma: A Primer on Francophone Louisiana. LSU Press. https://books.google.com/books?id=XCBfDwAAQBAJ
- Creole peoples. (2020). In Academic Dictionaries and Encyclopedias. Enacademic. https://enacademic.com/dic.nsf/enwiki/53856
- Antillean Creole. (2020). In Academic Dictionaries and Encyclopedias. Enacademic. https://enacademic.com/dic.nsf/enwiki/614299
- Irvine, M. (2020). St. Lucia Creole English and Dominica Creole English. World Englishes. https://doi.org/10.1111/weng.12519
- Irvine, M. (2020). Language contact in St. Lucia: The features and origins of St. Lucia Creole English [PhD thesis]. ResearchSpace@ Auckland.
- Stephens, M. R. (2013). Imagining Resistance and Solidarity in the Neoliberal Age of US Imperialism, Black Feminism, and Caribbean Diaspora.
- Phillips, E. M. (2006). Recognising the Language of Calypso as “Symbolic Action” in Resolving Conflict in the Republic of Trinidad and Tobago. Caribbean Quarterly, 52(1), 53–73.
- Levenberg, L., Neilson, T., & Rheams, D. (2018). Research methods for the digital humanities. Springer.
- Stoddart, S., Palmisano, A., Redhouse, D., Barker, G., Di Paola, G., Motta, L., Rasmussen, T., Samuels, T., & Witcher, R. E. (2020). Patterns of Etruscan urbanism. Frontiers in Digital Humanities, 7, 1.
- Ortman, S. G. (2019). A New Kind of Relevance for Archaeology. Frontiers in Digital Humanities, 6, 16. https://doi.org/10.3389/fdigh.2019.00016
- Ziegler Delgado, M. M. (2020). The time of digital humanities: Between art history, cultural heritage, global citizenship and education in digital skills. Revista De Comunicación De La SEECI, 52, 29. http://search.ebscohost.com/login.aspx?direct=true&db=edb&AN=145102698&site=eds-live&scope=site
- Cornia, M., Stefanini, M., Baraldi, L., Corsini, M., & Cucchiara, R. (2020). Explaining digital humanities by aligning images and textual descriptions. Pattern Recognition Letters, 129, 166–172. http://search.ebscohost.com/login.aspx?direct=true&db=edselp&AN=S0167865519303381&site=eds-live&scope=site
- Ye, Y., & Boot, P. (2020). Machine Translation as an Alternative to Language-Specific Dictionaries for LIWC. http://search.ebscohost.com/login.aspx?direct=true&db=edsbas&AN=edsbas.EE8D7521&site=eds-live&scope=site
- Mattmann, C. A. (2013). A vision for data science. Nature, 493(7433), 473–475.
- Bates, A. E., Primack, R. B., Moraga, P., & Duarte, C. M. (2020). COVID-19 pandemic and associated lockdown as a "Global Human Confinement Experiment" to investigate biodiversity conservation. Biological Conservation. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7284281/
- Buckley, R. (2020). Conservation implications of COVID19: Effects via tourism and extractive industries. Biological Conservation. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7247974/
- Corlett, R. T., Primack, R. B., Devictor, V., Maas, B., Goswami, V. R., Bates, A. E., Koh, L. P., Regan, T. J., Loyola, R., Pakeman, R. J., & others. (2020). Impacts of the coronavirus pandemic on biodiversity conservation. Biological Conservation, 246, 108571. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7139249
- Myers, N., Mittermeier, R. A., Mittermeier, C. G., Da Fonseca, G. A. B., & Kent, J. (2000). Biodiversity hotspots for conservation priorities. Nature, 403(6772), 853–858. https://www.nature.com/articles/35002501
- Buizer, M., Elands, B., & Vierikko, K. (2016). Governing cities reflexively—The biocultural diversity concept as an alternative to ecosystem services. Environmental Science & Policy, 62, 7–13. https://doi.org/10.1016/j.envsci.2016.03.003
- Maffi, L. (2002). Endangered languages, endangered knowledge. International Social Science Journal, 54(173), 385–393. https://doi.org/10.1111/1468-2451.00390
- Maffi, L. (2003). The “business” of language endangerment. Language in the Twenty-First Century, 67–86. https://www.google.com/books/edition/Language_in_the_21st_Century/Ca57m42RkMAC
- Mooij, M. L. J., Dessartre Mendonça, S., & Arts, K. (2019). Conserving Biocultural Diversity through Community–Government Interaction: A Practice-Based Approach in a Brazilian Extractive Reserve. Sustainability, 11(1), 32. https://doi.org/10.3390/su11010032
- Simbiak, M., Supriatna, J., Walujo, E. B., & others. (2019). Current status of ethnobiological studies in Merauke, Papua, Indonesia: A perspective of biological-cultural diversity conservation. Biodiversitas Journal of Biological Diversity, 20(12). https://doi.org/10.1007/978-94-017-8941-7_14
- St-Hilaire, A. (2008). Postcolonialism, identity, and the French language in St. Lucia. New West Indian Guide/Nieuwe West-Indische Gids, 81(1-2), 55–77.
- St-Hilaire, A. (2011). Kwéyòl in postcolonial Saint Lucia: Globalization, language planning, and national development (Vol. 40). John Benjamins Publishing. https://www.jbe-platform.com/content/books/9789027284648
- UNESCO. (2017). Technical and Vocational Education and Training (TVET) Policy Review: Saint Lucia. In unesdoc.unesco.org. United Nations Educational, Scientific and Cultural Organization. https://unesdoc.unesco.org/ark:/48223/pf0000247494
- CDF. (2019). National Cultural Policy. In Cultural Development Foundation (CDF) St. Lucia. Ministry of Social Transformation, Culture & Local Government. http://www.cdfstlucia.org/who-we-are/national-cultural-policy/
- Elands, B. H. M., Vierikko, K., Andersson, E., Fischer, L. K., Gonçalves, P., Haase, D., Kowarik, I., Luz, A. C., Niemelä, J., Santos-Reis, M., & Wiersum, K. F. (2019). Biocultural diversity: A novel concept to assess human-nature interrelations, nature conservation and stewardship in cities. Urban Forestry & Urban Greening, 40, 29–34. https://doi.org/10.1016/j.ufug.2018.04.006
- Reyes-Valdés, M. H., & Kantartzi, S. K. (2020). An information theory approach to biocultural complexity. Scientific Reports, 10(1), 7203. https://doi.org/10.1038/s41598-020-64260-5
- Giebel, M. (2019). Is it a good idea to publish during your PhD? Yes, but ... In Nature News. Nature Publishing Group. https://socialsciences.nature.com/posts/53143-is-it-publish-or-perish-for-phd-students-it-depends
- Patterns. (2020). In journals.elsevier.com/patterns. Elsevier.com. https://www.journals.elsevier.com/patterns/