The latest, more detailed version of this file can be found here.
Phonetic research and experiments involve many routine tasks: creating a presentation that contains all stimuli, renaming and concatenating multiple sound files recorded during a session, automatically annotating ‘Praat’ TextGrids (one of the sound annotation formats provided by the ‘Praat’ software, see Boersma & Weenink 2018, http://www.fon.hum.uva.nl/praat/), creating an html table with annotations and spectrograms, and converting between formats (‘Praat’ TextGrid, ‘EXMARaLDA’ (Schmidt and Wörner 2009) and ‘ELAN’ (Wittenburg et al. 2006)). All of these tasks can be solved with a mixture of different tools (any programming language can rename files automatically, Praat has scripts for concatenating and renaming files, etc.). phonfieldwork provides functionality that makes it easier to solve those tasks without any additional tools. You can also compare its functionality with other packages: ‘rPraat’ (Bořil and Skarnitzl 2016), ‘textgRid’ (Reidy 2016), ‘pympi’ (Lubbers and Torreira 2013) (thanks to Lera Dushkina and Anya Klezovich for letting me know about pympi).
There are many books about linguistic fieldwork and experiments (e.g. Gordon (2003), Bowern (2015)). This tutorial covers only the data organization part. I will focus on cases where the researcher clearly knows what she or he wants to analyze and has already created a list of stimuli that she or he wants to record. For now phonfieldwork works only with .wav(e) and .mp3 audio files and with the .TextGrid, .eaf, .exb, .srt, Audacity .txt and .flextext annotation formats, but the main functionality is available for .TextGrid files (I plan to extend it to other types of data). In the following sections I will describe my workflow for phonetic fieldwork and experiments.
Before you start, make sure that you have installed the package, for example with the command install.packages("phonfieldwork").
This command installs the latest stable version of the phonfieldwork package from CRAN. Since CRAN runs multiple checks on a package before making it available, this is the safest option. Alternatively, you can install the development version from GitHub.
If you have any trouble installing the package, you will not be able to use its functionality. In that case you can create an issue on GitHub or send an email. Since this package could completely destroy your data, please do not use it until you are sure that you have made a backup.
Use library(phonfieldwork) to load the package.
This tutorial was made using the following version of phonfieldwork:
## [1] '0.0.7'
This tutorial can be cited as follows:
##
## Moroz G (2020). _Phonetic fieldwork and experiments with phonfieldwork
## package_. <URL: https://CRAN.R-project.org/package=phonfieldwork>.
##
## A BibTeX entry for LaTeX users is
##
## @Manual{,
## title = {Phonetic fieldwork and experiments with phonfieldwork package},
## author = {George Moroz},
## year = {2020},
## url = {https://CRAN.R-project.org/package=phonfieldwork},
## }
If you have any trouble using the package, do not hesitate to create an issue on GitHub.
phonfieldwork package
Most phonetic research consists of the following steps:
The phonfieldwork package was created to help with items 3, 5 and 8, and partially with item 4.
To make the automatic annotation of data easier, I usually record each stimulus as a separate file. While recording, I carefully listen to my consultants to make sure that they are producing the kind of speech I want: three isolated pronunciations of the same stimulus, separated by a pause and contained in a carrier phrase. In case a speaker does not produce three clear repetitions, I ask them to repeat the task, so that as a result of my fieldwork session I will have:
* a collection of small sound (or video) files with the same sampling rate, resolution (bit), and number of channels
* a list of successful and unsuccessful attempts to produce a stimulus according to my requirements (usually I keep this list in a regular notebook)
There are some phoneticians who prefer to record everything, for language documentation purposes. I think that should be a separate task: you can’t have your cake and eat it too. But if you insist on recording everything, it is possible to run two recorders at the same time: one could run during the whole session, while the other is used to produce small audio files. You can also use special software to record your stimuli automatically on a computer (e.g. PsychoPy).
You can show a native speaker your stimuli one by one, or not show them the stimuli but ask them to pronounce a certain stimulus or its translation. I use presentations to collect all stimuli in a particular order without the risk of omissions.
Since each stimulus is recorded as a separate audio file, it is possible to merge them into one file automatically and make an annotation in a Praat TextGrid (the same result can be achieved with the Concatenate recoverably command in Praat). After this step, the user needs to do some annotation of her/his own. When the annotation part is finished, it is possible to extract the annotated parts to a table, where each annotated object is a row characterised by some features (stimulus, repetition, speaker, etc.). You can play the sound file and view its oscillogram and spectrogram. Here is an example of such a file and instructions for creating it.
phonfieldwork package in use
There are several ways to enter information about a list of stimuli into R:
* using the c() function you can create a vector of all words and store it in a variable my_stimuli (you can choose any other name);
* you can store the list in a .csv file and read it into R using the read.csv() function;
* you can store the list in an .xls or .xlsx file and read it into R using the read_xls() or read_xlsx() functions from the readxl package. If the readxl package is not installed on your computer, install it using install.packages("readxl").
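The first two options can be sketched as follows. The words tip, tap and top are only illustrative, the column name stimuli is chosen to match the my_stimuli_df$stimuli used in the rest of this tutorial, and the .csv file is written to a temporary location only so that the example is self-contained.

```r
# Option 1: type the stimuli directly into a character vector.
my_stimuli <- c("tip", "tap", "top")  # illustrative words

# Option 2: keep the stimuli in a .csv file with a "stimuli" column.
# Here the file is created on the fly; normally it would already exist.
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(stimuli = my_stimuli), csv_path, row.names = FALSE)

my_stimuli_df <- read.csv(csv_path, stringsAsFactors = FALSE)
my_stimuli_df$stimuli
```

Setting stringsAsFactors = FALSE keeps the stimuli as plain character strings on older versions of R as well.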
When the list of stimuli is loaded into R, you can create a presentation for elicitation. It is important to define an output directory, so in the following example I use the getwd() function, which returns the path to the current working directory. You can set any directory as your current one using the setwd() function. It is also possible to provide a path to your intended output directory with the output_dir argument (e. g. “/home/user_name/…”). This (unlike setwd()) does not change your working directory.
create_presentation(stimuli = my_stimuli_df$stimuli,
                    output_file = "first_example",
                    output_dir = getwd())
As a result, a file “first_example.html” was created in the output folder. You can change the name of this file by changing the output_file argument. The .html file now looks as follows:
It is also possible to change the output format using the output_format argument. By default it is “html”, but you can also use “pptx” (this is a relatively new feature of rmarkdown, so update the package in case you get errors). There is also an additional argument translations, where you can provide translations for the stimuli so that they appear next to the stimuli on the slide.
After collecting data and removing sound files with unsuccessful elicitations, one could end up with the following structure: for each speaker (s1 and s2) there is a folder that contains three audio files. Now let’s rename the files.
The rename_soundfiles() function created a backup folder with all of the unrenamed files, and renamed all files using the prefix provided in the prefix argument. There is an additional argument backup that can be set to FALSE (it is TRUE by default), in case you are sure that the renaming function will work properly with your files and stimuli, and you do not need a backup of the unrenamed files.
rename_soundfiles(stimuli = my_stimuli_df$stimuli,
                  prefix = "s2_",
                  suffix = paste0("_", 1:3),
                  path = "s2/",
                  backup = FALSE)
The last command renamed the sound files in the s2 folder, adding the prefix s2 as in the previous example, and the suffixes 1-3. On most operating systems it is impossible to create two files with the same name in one folder, so sometimes it can be useful to add some kind of index at the end of the file names.
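If you switch the built-in backup off, it is worth keeping your own copy of the recordings first. A minimal base R sketch; the folder and file names are hypothetical, and everything is created in a temporary directory so that no real data is touched:

```r
# Hypothetical session folder with three recordings (created here for the demo).
session_dir <- file.path(tempdir(), "s2")
dir.create(session_dir, showWarnings = FALSE)
file.create(file.path(session_dir, c("01.wav", "02.wav", "03.wav")))

# Copy every file into a sibling "s2_backup" folder before renaming anything.
backup_dir <- file.path(tempdir(), "s2_backup")
dir.create(backup_dir, showWarnings = FALSE)
copied <- file.copy(from = list.files(session_dir, full.names = TRUE),
                    to = backup_dir,
                    overwrite = TRUE)

# Proceed only if every file was copied successfully.
stopifnot(all(copied))
list.files(backup_dir)
```

file.copy() returns one logical value per file, so a single FALSE is enough to stop before any renaming happens.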
Sometimes it is useful to get information about sound duration:
It is also possible to analyze the whole folder:
For now phonfieldwork works only with .wav(e) and .mp3 sound files and several video formats (.mp4, .avi and .mov).
After all the files are renamed, you can merge them into one. Remember that sampling rate, resolution (bit), and number of channels should be the same across all recordings. It is possible to resample files with the resample() function from the bioacoustics package.
This command creates a new sound file s1_all.wav and an associated Praat TextGrid s1_all.TextGrid.
The resulting file can be parsed with Praat (subscripted t is the result of Praat’s conversion):
It is possible to annotate words using an existing annotation (since file concatenation follows the order of the files as sorted on the computer, I use the sort() function in order to make the annotation correct):
my_stimuli_df$stimuli
annotate_textgrid(annotation = sort(my_stimuli_df$stimuli),
                  textgrid = "s1/s1_all.TextGrid")
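The reason for sort() can be seen in base R alone: list.files() returns file names in alphabetical order, which may differ from the elicitation order. A small sketch with illustrative stimuli and empty placeholder files (the file naming here is simplified and is not meant to reproduce what rename_soundfiles() produces):

```r
my_stimuli <- c("tip", "tap", "top")  # elicitation order (illustrative)

# Create empty placeholder files named after the stimuli.
demo_dir <- file.path(tempdir(), "sort_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, paste0(my_stimuli, ".wav")))

# Files come back alphabetically, not in elicitation order.
files_on_disk <- list.files(demo_dir)
files_on_disk

# Sorting the annotation puts the labels in the same order as the files.
sorted_stimuli <- sort(my_stimuli)
paste0(sorted_stimuli, ".wav") == files_on_disk
```

With file names that contain upper-case letters, digits or non-ASCII characters the two orders can depend on the locale, so it is worth checking the result by eye before annotating.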
As you can see in the example, the annotate_textgrid() function creates a backup of the tier and adds a new tier on top of the previous one. It is possible to prevent the function from doing so by setting the backup argument to FALSE.
Imagine that we are interested in annotating vowels. The most common solution is to open Praat and create new annotations by hand. But it is also possible to create them in advance using subannotations. The idea is that you choose a baseline tier whose intervals are later automatically cut into smaller pieces on another tier.
create_subannotation(textgrid = "s1/s1_all.TextGrid",
                     tier = 1, # this is a baseline tier
                     n_of_annotations = 3) # how many empty annotations per unit?
Now we can annotate the created tier:
annotate_textgrid(annotation = c("", "æ", "", "", "ı", "", "", "ɒ", ""),
textgrid = "s1/s1_all.TextGrid",
tier = 3,
backup = FALSE)
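Writing such an annotation vector by hand becomes error-prone with many stimuli. One possible way to build it in base R, assuming (as in the example above) three subannotations per unit with the vowel label in the middle one:

```r
vowels <- c("æ", "ı", "ɒ")  # one vowel label per stimulus, in file order

# rbind() stacks the three rows; as.vector() then reads the matrix column by
# column, giving "", vowel, "" for each stimulus in turn.
annotation <- as.vector(rbind("", vowels, ""))
annotation
```

The resulting vector has the same nine elements as the one typed out above and can be passed to the annotation argument.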
You can see that we created a third tier with annotation. The only thing left is to move the annotation boundaries in Praat (this cannot be automated):
You can see from the last figure that no backup tier was created (backup = FALSE) and that the third tier was annotated (tier = 3).
First, it is important to create a folder where all of the extracted files will be stored:
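A folder can be created from within R with dir.create(). In this sketch the folder name s1/s1_sounds follows the one used later in this tutorial, while the project root is a temporary directory chosen only to keep the example harmless:

```r
# Hypothetical project root; replace tempdir() with your own project folder.
project_dir <- tempdir()
sounds_dir <- file.path(project_dir, "s1", "s1_sounds")

# recursive = TRUE also creates the parent "s1" folder if it does not exist;
# showWarnings = FALSE keeps the call quiet when the folder is already there.
dir.create(sounds_dir, recursive = TRUE, showWarnings = FALSE)
dir.exists(sounds_dir)
```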
It is possible to extract all annotated parts into separate files, based on an annotation tier:
It is possible to view an oscillogram and spectrogram of any sound file:
There are additional parameters:
* title – the title for the plot
* from – time in seconds at which to start extraction
* to – time in seconds at which to stop extraction
* zoom – time in seconds for zooming the spectrogram
* text_size – size of the text on the plot
* annotation – an optional path to a TextGrid file or a dataframe with annotations (see section 5)
* frequency_range – the frequency range to be displayed for the spectrogram
* dynamic_range – values greater than this many dB below the maximum will be displayed in the same color
* window_length – the desired length in milliseconds for the analysis window
* window – window type (can be “rectangular”, “hann”, “hamming”, “cosine”, “bartlett”, “gaussian”, and “kaiser”)
* spectrum_info – logical value; if FALSE, information about the spectrogram is not printed on the right side of the plot
* output_file – the name of the output file
* output_width – the width of the device
* output_height – the height of the device
* output_units – the units in which height and width are given. This can be “px” (pixels, which is the default value), “in” (inches), “cm” or “mm”.
If you have a long file, it is really important not to draw the whole file, since it won’t fit into the RAM of your computer. You can use the from and to arguments in order to plot a fragment of the sound and annotation:
It is also possible to use the zoom argument to show a part of the spectrogram while keeping the broader oscillogram context:
If the output_file argument is provided, R will save the plot in your directory instead of displaying it.
It is also possible to create visualizations of all sound files in a folder. For this purpose you need to specify a source folder with the argument sounds_from_folder and a target folder for the images (pic_folder_name). The new image folder is automatically created in the upper level folder, so that sound and image folders are on the same level in the tree structure of your directory.
It is also possible to use the argument textgrid_from_folder in order to specify the folder where the .TextGrids for annotation are (it could be the same folder as the sound one). By default the draw_sound() function with the sounds_from_folder argument uses the file name as the title of each picture, but it is possible to turn this off using the argument title_as_filename = FALSE.
If you are familiar with the Raven program for bioacoustics, you probably miss the ability to annotate not only time, but also a frequency range. In order to do it you need to create a dataframe with the columns time_start, time_end, freq_low and freq_high:
raven_an <- data.frame(time_start = 450,
                       time_end = 520,
                       freq_low = 3,
                       freq_high = 4.5)
draw_sound(system.file("extdata", "test.wav", package = "phonfieldwork"),
           raven_annotation = raven_an)
It is also possible to use multiple values, colors (adding a colors column) and annotation (adding a content column):
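A sketch of such a dataframe with two boxes. The column names are the ones listed above; the numeric values, colors and labels are illustrative and simply follow the units of the single-box example:

```r
# Each row describes one box: a time span, a frequency span,
# a color for the box and a text label.
raven_an <- data.frame(time_start = c(250, 450),
                       time_end   = c(400, 520),
                       freq_low   = c(1, 3),
                       freq_high  = c(2, 4.5),
                       colors     = c("red", "blue"),
                       content    = c("a", "b"),
                       stringsAsFactors = FALSE)
raven_an
```

The dataframe is then passed to draw_sound() through the raven_annotation argument, exactly as in the single-box example.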
The phonfieldwork package also provides several methods for reading different file types into R. This makes it possible to analyze them and convert them into .csv files (e. g. using the write.csv() function). The main advantage of using those functions is that all of them return data.frames with the same columns (time_start, time_end, content and source). This makes it easier to use the result in the draw_sound() function, which makes it possible to visualise all kinds of sound annotation systems.
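Converting such a dataframe to .csv needs only base R. In the sketch below the dataframe is a hand-made stand-in for the output of one of the reading functions (its values are copied from the .srt example in this tutorial), and the file is written to a temporary path:

```r
# Stand-in for the result of one of the *_to_df() functions.
annotations <- data.frame(
  id = 1:4,
  content = c("t", "e", "s", "t"),
  time_start = c(0.013, 0.248, 0.396, 0.512),
  time_end = c(0.248, 0.396, 0.512, 0.653),
  source = "test.srt",
  stringsAsFactors = FALSE
)

# Write to a temporary file; use your own path in practice.
csv_path <- tempfile(fileext = ".csv")
write.csv(annotations, csv_path, row.names = FALSE)

# Reading the file back recovers the same table.
restored <- read.csv(csv_path, stringsAsFactors = FALSE)
all.equal(annotations, restored)
```

row.names = FALSE prevents R from adding an extra unnamed column of row numbers to the file.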
* .TextGrid from Praat (just change the system.file() function to the path to the file); see also the rPraat and textgRid packages
##    id time_start   time_end content tier        source
## 1   1 0.00000000 0.01246583            1 test.TextGrid
## 6   1 0.00000000 0.01246583            2 test.TextGrid
## 2   2 0.01246583 0.24781914       t    1 test.TextGrid
## 7   2 0.01246583 0.24781914            2 test.TextGrid
## 11  1 0.01246583 0.01246583       t    3 test.TextGrid
## 3   3 0.24781914 0.39552363       e    1 test.TextGrid
## 8   3 0.24781914 0.39552363            2 test.TextGrid
## 12  2 0.24781914 0.24781914       e    3 test.TextGrid
## 4   4 0.39552363 0.51157715       s    1 test.TextGrid
## 9   4 0.39552363 0.51157715            2 test.TextGrid
## 13  3 0.39552363 0.39552363       s    3 test.TextGrid
## 5   5 0.51157715 0.65267574       t    1 test.TextGrid
## 10  5 0.51157715 0.65267574            2 test.TextGrid
## 14  4 0.51157715 0.51157715       t    3 test.TextGrid
It is possible to read multiple .TextGrid files using the textgrids_from_folder argument.
* .eaf from ELAN (just change the system.file() function to the path to the file); see also the FRelan package by Niko Partanen
##    tier id content       tier_name tier_type time_start time_end   source
## 9     1  1               intervals     praat      0.000    0.012 test.eaf
## 10    2  1         empty_intervals     praat      0.000    0.012 test.eaf
## 11    1  2       t       intervals     praat      0.012    0.248 test.eaf
## 12    2  2       C empty_intervals     praat      0.012    0.248 test.eaf
## 1     1  3       e       intervals     praat      0.248    0.396 test.eaf
## 2     2  3       V empty_intervals     praat      0.248    0.396 test.eaf
## 3     1  4       s       intervals     praat      0.396    0.512 test.eaf
## 4     2  4       C empty_intervals     praat      0.396    0.512 test.eaf
## 5     1  5       t       intervals     praat      0.512    0.652 test.eaf
## 6     2  5       C empty_intervals     praat      0.512    0.652 test.eaf
## 7     1  6               intervals     praat      0.652  300.000 test.eaf
## 8     2  6         empty_intervals     praat      0.652  300.000 test.eaf
It is possible to read multiple .eaf files using the eafs_from_folder argument.
* .exb from EXMARaLDA (just change the system.file() function to the path to the file)
##   tier id content tier_name tier_type tier_category tier_speaker time_start
## 3    1  1       t     X [v]         t             v         SPK0 0.06908955
## 1    1  2       e     X [v]         t             v         SPK0 0.24989836
## 5    1  3       s     X [v]         t             v         SPK0 0.38072750
## 7    1  4       t     X [v]         t             v         SPK0 0.40424735
## 4    2  1       C     X [v]         a             v         SPK0 0.06908955
## 2    2  2       V     X [v]         a             v         SPK0 0.24989836
## 6    2  3       C     X [v]         a             v         SPK0 0.38072750
## 8    2  4       C     X [v]         a             v         SPK0 0.40424735
##    time_end   source
## 3 0.2498984 test.exb
## 1 0.3807275 test.exb
## 5 0.4042473 test.exb
## 7 0.6526757 test.exb
## 4 0.2498984 test.exb
## 2 0.3807275 test.exb
## 6 0.4042473 test.exb
## 8 0.6526757 test.exb
It is possible to read multiple .exb files using the exbs_from_folder argument.
* .srt (just change the system.file() function to the path to the file)
##   id content time_start time_end   source
## 0  1       t      0.013    0.248 test.srt
## 1  2       e      0.248    0.396 test.srt
## 2  3       s      0.396    0.512 test.srt
## 3  4       t      0.512    0.653 test.srt
* .txt from Audacity
##   time_start  time_end content            source
## 1  0.2319977 0.3953891    sssw test_audacity.txt
* .flextext from FLEx (this is actually not connected with the main functionality of phonfieldwork, but I’d like to have it):
There is also an additional function for working with the .flextext format that converts it to a glossed document in .docx or .html format (see examples: .docx, .html):
create_glossed_document(flextext = "files/zilo_test.flextext",
                        output_dir = ".") # you need to specify the path to the output folder
All of these functions (tier_to_df(), textgrid_to_df(), eaf_to_df(), exb_to_df(), audacity_to_df(), srt_to_df()) except flextext_to_df() can be used to visualise sound annotation:
Sound viewer (here is an example 1 and example 2) is a useful tool that combines your annotations and makes them searchable. It also produces a ready-to-go .html file that can be uploaded to a server (e. g. to GitHub Pages) and be available to anyone in the world.
In order to create a sound viewer you need three things:
We will start with the previous folder structure:
We have all folders:
So what is left is the table. It is possible to create it manually (or load it from .csv or .xlsx files, see section 4.1):
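A manually created table could look like the sketch below. The column names and values are illustrative, and the sketch assumes one row per extracted sound, listed in the same order as the files in the folder:

```r
# One row per sound file; any number of descriptive columns can be added.
df <- data.frame(word = c("tap", "tip", "top"),
                 sounds = c("æ", "ı", "ɒ"),
                 stringsAsFactors = FALSE)
df
```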
This table could be used in order to create an annotation viewer:
create_viewer(audio_dir = "s1/s1_sounds/",
picture_dir = "s1/s1_pics/",
table = df,
output_dir = "s1/",
output_file = "stimuli_viewer")
As a result, a stimuli_viewer.html file was created in the s1 folder.
You can find the created example here.
Unfortunately, creating the table for the annotation viewer by hand, as presented in this section, does not scale to a large number of sounds. It is possible to derive such a table from the annotation TextGrid that we created earlier. Here is a TextGrid:
So in order to create the desired table we can use the tier_to_df() function:
t1 <- tier_to_df("s1/s1_all.TextGrid", tier = 1)
t1
t3 <- tier_to_df("s1/s1_all.TextGrid", tier = 3)
t3
As we see the first tier is ready, but the third tier contains empty annotations. Let’s remove them:
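Removing empty annotations is an ordinary row filter. In this sketch t3 is a hand-made stand-in for the output of tier_to_df() on the third tier; the column names are an assumption based on the tables shown in this tutorial:

```r
# Stand-in for the third tier: nine intervals, only three of them labelled.
t3 <- data.frame(
  id = 1:9,
  content = c("", "æ", "", "", "ı", "", "", "ɒ", ""),
  stringsAsFactors = FALSE
)

# Keep only the rows whose annotation is not an empty string.
t3 <- t3[t3$content != "", ]
t3
```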
So from this point it is possible to create the table that we wanted:
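One way to combine the two tiers into a single table, again with hand-made stand-ins for the cleaned tiers. The column names stimuli and vowel are hypothetical; new_df is the name used in the call that follows:

```r
# Stand-ins for the two tiers after cleaning (illustrative values).
t1 <- data.frame(content = c("tap", "tip", "top"), stringsAsFactors = FALSE)
t3 <- data.frame(content = c("æ", "ı", "ɒ"), stringsAsFactors = FALSE)

# Both tiers must describe the same number of units before they are combined.
stopifnot(nrow(t1) == nrow(t3))

new_df <- data.frame(stimuli = t1$content,
                     vowel = t3$content,
                     stringsAsFactors = FALSE)
new_df
```

The row-count check guards against a forgotten empty interval silently shifting the vowels against the words.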
So now we are ready to run our code for creating an annotation viewer:
create_viewer(audio_dir = "s1/s1_sounds/",
picture_dir = "s1/s1_pics/",
table = new_df,
output_dir = "s1/",
output_file = "stimuli_viewer")
By default, the resulting annotation viewer is sorted by file name, so if you want a different default sorting you can specify the column names that the resulting table should be sorted by using the sorting_columns argument.
If you are familiar with my package lingtypology (Moroz 2017) for interactive linguistic map generation and API for typological databases, there is good news for you: it is possible to connect the two packages by creating an interactive map that shares the same hear and view buttons. In order to do it you need to:
* add a glottocode column with language glottocodes from Glottolog (Hammarström, Forkel, and Haspelmath 2020) to your dataframe with annotation details;
* install lingtypology with the command install.packages("lingtypology") if you don’t have it installed;
* add the map = TRUE argument to the create_viewer() function.
I will add some glottocodes for Russian, Polish and Czech to the dataframe that we have already worked with (for these data it doesn’t make any sense, I am just giving an example of usage):
new_df$glottocode <- c("russ1263", "poli1260", "czec1258")
create_viewer(audio_dir = "s1/s1_sounds/",
picture_dir = "s1/s1_pics/",
table = new_df,
output_dir = "s1/",
output_file = "stimuli_viewer2",
map = TRUE)
Here is the result file.
It is also possible to provide your own coordinates with latitude and longitude columns. In that case the glottocode column is optional.
Bořil, Tomáš, and Radek Skarnitzl. 2016. “Tools rPraat and mPraat.” In Text, Speech, and Dialogue: 19th International Conference, TSD 2016, Brno, Czech Republic, September 12-16, 2016, Proceedings, edited by Petr Sojka, Aleš Horák, Ivan Kopeček, and Karel Pala, 367–74. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-319-45510-5_42.
Bowern, Claire. 2015. Linguistic Fieldwork: A Practical Guide. Springer.
Gordon, Matthew. 2003. “Collecting Phonetic Data on Endangered Languages.” In 15th International Congress of Phonetic Sciences, 207–10.
Hammarström, Harald, Robert Forkel, and Martin Haspelmath. 2020. “Glottolog 4.2.”
Lubbers, Mart, and Francisco Torreira. 2013. “Pympi-Ling: A Python Module for Processing ELANs EAF and Praats TextGrid Annotation Files.”
Moroz, George. 2017. Lingtypology: Easy Mapping for Linguistic Typology. https://CRAN.R-project.org/package=lingtypology.
Reidy, Patrick. 2016. TextgRid: Praat Textgrid Objects in R. https://CRAN.R-project.org/package=textgRid.
Schmidt, Thomas, and Kai Wörner. 2009. “EXMARaLDA–Creating, Analysing and Sharing Spoken Language Corpora for Pragmatic Research.” Pragmatics 19 (4): 565–82.
Wittenburg, Peter, Hennie Brugman, Albert Russel, Alex Klassmann, and Han Sloetjes. 2006. “ELAN: A Professional Framework for Multimodality Research.” In 5th International Conference on Language Resources and Evaluation (LREC 2006), 1556–9.