yihwan says ...: Korean/English Cross-Linguistic Study

Summary of Study

For the past few weeks, I’ve been fortunate enough to work with Caitlin Fausey, a PhD candidate in the Stanford Psychology Department, on her study comparing the frequency of agentive versus non-agentive speech in English and Korean. To collect data for the study, a large sample of native Korean and English speakers was asked to watch a video recording of accidental and purposeful events (e.g. a man popping a balloon on purpose and by accident, a man stepping on a can on purpose and by accident) and to describe what they saw.

The data collected from this process was then coded: a “1” was assigned to sentences that contained an agentive verb (e.g. he popped the balloon), a “2” to sentences that contained a non-agentive verb (e.g. the balloon popped), and a “5” to sentences that were un-codeable (e.g. completely irrelevant descriptions, sentences that did not contain a key verb). The two coders held an initial joint meeting to resolve disagreements in coding, and any outstanding disagreements were then resolved over email. A second joint meeting was held to begin analyzing and graphing the data collected during the experiment. The data was transferred from Excel to SPSS, where it was consolidated and graphed. This analysis examined how often Korean speakers used agentive language to describe accidental and non-accidental events, and also identified the mean, potential outliers, and the statistical significance of the results. A third and final meeting was held to compare these results with those of the English speakers and to identify any significant relationship between the two sets of data.
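The coding scheme above can be sketched in a few lines of Python. This is only an illustration, not the procedure we actually ran in Excel and SPSS: the function names and the sample codes are hypothetical, and leaving un-codeable sentences out of the denominator is an assumption made for the sketch.

```python
from collections import Counter

# Codes as described in the post: 1 = agentive, 2 = non-agentive, 5 = un-codeable.
AGENTIVE, NON_AGENTIVE, UNCODEABLE = 1, 2, 5


def find_disagreements(coder_a, coder_b):
    """Return the positions where the two coders assigned different codes."""
    return [i for i, (a, b) in enumerate(zip(coder_a, coder_b)) if a != b]


def agentive_proportion(codes):
    """Share of agentive sentences among the codeable ones.

    Assumption for this sketch: sentences coded 5 are excluded entirely.
    Returns None when there is nothing codeable.
    """
    counts = Counter(codes)
    codeable = counts[AGENTIVE] + counts[NON_AGENTIVE]
    if codeable == 0:
        return None
    return counts[AGENTIVE] / codeable


# Hypothetical codes for six descriptions (not data from the study).
coder_a = [1, 2, 2, 5, 1, 2]
coder_b = [1, 2, 1, 5, 1, 2]

# Positions the coders would need to discuss before analysis.
to_discuss = find_disagreements(coder_a, coder_b)

# Codes after the disagreement is resolved (here, in favor of coder_a).
resolved = [1, 2, 2, 5, 1, 2]
share_agentive = agentive_proportion(resolved)
```

In this made-up example the coders disagree only on the third sentence, and two of the five codeable sentences end up agentive.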

The data suggested that Korean speakers tended to use non-agentive speech more frequently than their English-speaking counterparts to describe accidental events. Korean and English speakers used agentive speech to describe non-accidental events with similar frequency.

The Difficulties of Coding

Although I considered myself fairly fluent in Korean prior to the start of this experiment, I realized that the Korean language was far more complicated than I had known. In the Korean sentences I coded, the test subjects rarely included the agent when describing the events they saw. Instead, they signaled whether the sentence was agentive or non-agentive by adding either a subject-marker or an object-marker after the noun being acted upon. This pattern was particularly evident in the “balloon-popping” scene:

풍선을 터트렸다.
(He) popped the balloon.
Demonstrates the use of the object-marker “eul.” In this sentence, the balloon is the object of an unknown (or rather implied) agent.

풍선이 터졌다.
The balloon popped.
Demonstrates use of the subject-marker “ee.” In this sentence, the balloon is the subject, and the verb takes its intransitive form (터지다, “to pop/burst”) rather than the transitive 터트리다.

Unfortunately, not all the data sets were this clear. Some sentences included a lot of extraneous information on the agent’s shirt color, what he was doing before he popped the balloon, potential reasons behind this seemingly senseless action, etc. Needless to say, it became progressively more difficult to find this single marker embedded within a sentence to determine whether the sentence represented agentive or non-agentive language.
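To show why the marker can be hard to spot, here is a deliberately naive Python sketch that looks for the marker attached to one particular noun. All of the coding in the study was done by hand; the function name and the example sentences are hypothetical, and a real tool would need a proper Korean morphological analyzer rather than simple string matching.

```python
# Particles assumed for this sketch: 을/를 mark an object, 이/가 mark a subject.
OBJECT_MARKERS = ("을", "를")
SUBJECT_MARKERS = ("이", "가")


def marker_on_noun(sentence, noun):
    """Return "object", "subject", or None for the marker attached to `noun`.

    Naive approach: split on whitespace, find a word that starts with the
    noun, and check whether the rest of the word is exactly one particle.
    """
    for token in sentence.split():
        token = token.strip(".,!?")
        if token.startswith(noun):
            suffix = token[len(noun):]
            if suffix in OBJECT_MARKERS:
                return "object"
            if suffix in SUBJECT_MARKERS:
                return "subject"
    return None


# Hypothetical descriptions (not sentences from the study's data).
short_agentive = "풍선을 터트렸다."
short_non_agentive = "풍선이 터졌다."
# "The man wearing a red shirt popped the balloon." The object-marker on
# "shirt" (셔츠를) is extra information and must not be mistaken for the key marker.
long_agentive = "빨간 셔츠를 입은 남자가 풍선을 터트렸다."

results = [
    marker_on_noun(short_agentive, "풍선"),
    marker_on_noun(short_non_agentive, "풍선"),
    marker_on_noun(long_agentive, "풍선"),
]
```

The long sentence contains three markers, but only the one attached to the balloon matters for the code, which is roughly the search I had to do by eye.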

Experiences Consolidating Data

Although I’d taken an introductory college-level psychology course prior to this study, I’d never used SPSS or Excel to analyze data first-hand. Therefore, I didn’t fully understand the concepts and processes we discussed during our meetings at first. But I gradually became familiar with the process of turning raw data into graphs to observe any general trends or patterns. One significant observation we made was that Korean speakers used non-agentive speech much more frequently than English speakers to describe accidental events, which supported Caitlin’s initial hypothesis.

One new thing I learned during this process was the importance of the “significance” factor (the p-value), which estimates how likely it is that a pattern at least as strong as the one in the data would occur by chance alone. By convention, the p-value should be less than .05 to make a publishable claim. A value below that threshold suggests that the difference is unlikely to be a fluke, although it does not by itself prove a causal relationship (in this case, between the speaker’s language and the use of agentive or non-agentive speech to describe accidental events). At first, the p-value was around .075, which was a little too high to make a definitive claim about the data. However, after two outliers out of a pool of 90 test subjects were eliminated (one English speaker who described accidental events with non-agentive language and one Korean speaker who described accidental events with agentive language with unusually high frequency), the p-value fell to .015, well below the magic number for publishing your results.
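One way to build intuition for what a p-value measures is a permutation test, sketched below in Python. This is not the test SPSS ran for the study (I am swapping in a simpler technique purely for illustration), and the function name, parameters, and per-speaker numbers are all hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Repeatedly shuffles the group labels and counts how often the shuffled
    difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical per-speaker proportions of agentive descriptions for
# accidental events (made-up numbers, not the study's data).
english = [0.8, 0.9, 0.85, 0.8, 0.9, 0.85, 0.8, 0.9]
korean = [0.1, 0.2, 0.15, 0.1, 0.2, 0.15, 0.1, 0.2]

p_value = permutation_p_value(english, korean, n_permutations=2000)
significant = p_value < 0.05
```

If the two groups really behaved the same, shuffling the labels would often produce a gap as large as the observed one and the p-value would be large. A small p-value means that such a gap rarely arises by chance.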

Future Prospects for this Study

One interesting trend I noticed while coding was that certain individuals tended to use either agentive speech or non-agentive speech more frequently when describing accidental events. In other words, the frequency of agentive speech may depend on the test subject as well as on the particular event being described. There also seemed to be a loose correlation between the use of agentive speech and the particular form of speech used to describe the event. Korean is a language with multiple levels of formality (honorific, deferential, humble, polite, blunt, half-talk) that depend on the speaker’s relationship with the individuals being spoken to. Perhaps another step for this study would be to code each sentence into one of the categories of speech mentioned above and to see whether there is any statistically significant pattern across the different levels of formality. However, as many of the sentences within the data set are incomplete, it would be difficult to code each sentence into one of these categories accurately and consistently. But if it could be done, one might be able to draw further conclusions about which level of speech lends itself to agentive or non-agentive speech.

Final Thoughts on the Study

Overall, I thought this study was a great opportunity for me to explore an entirely new aspect of language, one that takes a quantitative rather than qualitative approach. Coding the data allowed me to explore the intricacies of the Korean language, and I was definitely able to put all the grammatical concepts I’ve learned in my Korean language classes to use. The data analysis process provided a thorough refresher course in statistics, and I was able to see how statistics can be applied differently in various disciplines, particularly psychology. When I first applied for this introductory seminar, I mentioned that I wasn’t very interested in learning more about the technical aspects of language. However, after seeing how such details of language can be extrapolated and applied in a greater context, I’m happy to say that I am very much interested in learning more about the structure and syntax of various languages. I thoroughly enjoyed my experience working with Caitlin on her project, and I hope to become more active in research studies in the future.

Friday, June 4, 2010

Korean/English Cross-Linguistic Study - Thoughts and Reflections
