Latent Dimensions of Religion and Spirituality: A Longitudinal Correlated Topic Model
Seong-Hyeon (Sung) Kim¹, Nathaniel R. Strenger², & Narae Lee¹
¹ Fuller Graduate School of Psychology, Pasadena, California, USA
² Pastoral Counseling Center, Dallas, Texas, USA
Overview
• Religion & Spirituality (R/S)
• Research Questions
• Topic models
• Automated text analysis
• Topics: Latent dimensions of text
• Topic proportions as compositional data
• Ternary diagrams
• Topic correlations
Religion & Spirituality (R/S)
• Definitions
• Religion: “the search for significance that occurs within the context of established institutions that are designed to facilitate spirituality” (Pargament et al., 2013, p. 15).
• Spirituality: “the search for the sacred” (Pargament et al., 2013, p. 14).
Pargament, K. I., Mahoney, A., Exline, J. J., Jones, J. W., & Shafranske, E. P. (2013). Envisioning an integrative paradigm for the psychology of religion and spirituality. In K. I. Pargament, J. J. Exline, & J. W. Jones (Eds.), APA handbook of psychology, religion, and spirituality (Vol 1): Context, theory, and research (pp. 3–19). Washington, DC: American Psychological Association. https://doi.org/10.1037/14045-001
Religion & Spirituality (R/S)
• Gorsuch (1984) introduced factor analysis as a tool for investigating the dimensions of R/S.
• He criticized the oversupply of R/S measures.
• Our research introduces topic modeling as a tool for identifying the fundamental dimensions, or building blocks, of R/S as conceptualized in existing R/S measures.
Gorsuch, R. L. (1984). Measurement: The boon and bane of investigating religion. American Psychologist, 39(3), 228–236. https://doi.org/10.1037/0003-066X.39.3.228
Automated Text Analysis
• Quantitative (NOT qualitative) text analysis
• Three Different Types
1. Dictionary method: Pre-defined set of categories
2. Supervised learning: Outcome categories known (e.g., spam mail sorting)
3. Unsupervised learning: e.g., topic modeling (outcome categories unknown)
Topic Modeling
• Identify topics, the latent dimensions, in the text data
• Machine (statistical) learning + computer science + statistics
• Latent Dirichlet Allocation (LDA; Blei, Ng, & Jordan, 2003): Basic and popular, but does not allow topic correlations
Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3(Jan), 993–1022.
TASA Corpus: 37,000 Texts & 300 Topics
Steyvers, M., & Griffiths, T. (2007). Probabilistic topic models. In T. Landauer, D. S. McNamara, S. Dennis, & W. Kintsch (Eds.), Handbook of Latent Semantic Analysis (pp. 424–440). Hillsdale, NJ: Erlbaum.
Example: Steyvers & Griffiths (2007)
• 2 topics
• Each topic assigns approximately equal probability to three words:
• Topic 1: “money,” “loan,” and “bank”
• Topic 2: “river,” “stream,” and “bank”
• 16 documents were created by arbitrarily mixing the two topics
• Let’s analyze this collection of documents with LDA (Blei et al., 2003)
Steyvers, M., & Griffiths, T. (2007). Probabilistic topic models. In T. Landauer, D. S. McNamara, S. Dennis, & W. Kintsch (Eds.), Handbook of Latent Semantic Analysis (pp. 424–440). Hillsdale, NJ: Erlbaum.
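The mixing step in this example can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure Steyvers and Griffiths used: the exact 1/3 word probabilities, the document length of 12 words, the evenly spaced mixing weights, and the names `TOPICS` and `generate_document` are all assumptions made here.

```python
import random

# Two toy topics with equal word probabilities (assumed values for illustration).
TOPICS = {
    1: {"money": 1 / 3, "loan": 1 / 3, "bank": 1 / 3},
    2: {"river": 1 / 3, "stream": 1 / 3, "bank": 1 / 3},
}

def generate_document(weight_topic1, n_words, rng):
    """Draw each word by first picking a topic, then a word from that topic."""
    words = []
    for _ in range(n_words):
        topic = 1 if rng.random() < weight_topic1 else 2
        vocab = list(TOPICS[topic])
        probs = [TOPICS[topic][w] for w in vocab]
        words.append(rng.choices(vocab, weights=probs, k=1)[0])
    return words

rng = random.Random(0)
# 16 documents whose topic-1 weight runs from 1.0 down to 0.0 (an assumed mixing scheme).
documents = [generate_document(1 - i / 15, 12, rng) for i in range(16)]
```

Because "bank" belongs to both topics, a model has to rely on the words it co-occurs with to decide which topic generated it.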
Steyvers & Griffiths (2007)
Example: 16 Documents
Term Distributions for Topics
Topic 1

| Word | Probability |
|---|---|
| bank | .390 |
| money | .314 |
| loan | .287 |
| river | .009 |
| stream | .000 |

Topic 2

| Word | Probability |
|---|---|
| stream | .391 |
| bank | .345 |
| river | .240 |
| money | .012 |
| loan | .012 |
Topic Distribution for Documents
Matrix Factorization
Steyvers, M., & Griffiths, T. (2007). Probabilistic topic models. In T. Landauer, D. S. McNamara, S. Dennis, & W. Kintsch (Eds.), Handbook of Latent Semantic Analysis (pp. 424–440). Hillsdale, NJ: Erlbaum.
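One way to read the matrix-factorization view is that a document's word distribution is the topic-word matrix multiplied by that document's topic proportions. The sketch below uses the probabilities from the "Term Distributions for Topics" slide; the 75/25 topic mix and the names `PHI` and `word_distribution` are hypothetical and chosen only for illustration.

```python
# Topic-word probabilities copied from the "Term Distributions for Topics" slide.
PHI = {
    "topic1": {"bank": 0.390, "money": 0.314, "loan": 0.287, "river": 0.009, "stream": 0.000},
    "topic2": {"bank": 0.345, "money": 0.012, "loan": 0.012, "river": 0.240, "stream": 0.391},
}

def word_distribution(theta):
    """Return P(word | document) = sum over topics of P(word | topic) * P(topic | document)."""
    vocab = PHI["topic1"].keys()
    return {w: sum(theta[t] * PHI[t][w] for t in theta) for w in vocab}

# Hypothetical document that is 75% topic 1 and 25% topic 2.
theta_doc = {"topic1": 0.75, "topic2": 0.25}
p_words = word_distribution(theta_doc)
```

Stacking one such column per document gives the full word-by-document matrix as a product of a word-by-topic matrix and a topic-by-document matrix.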
LDA & Beyond
• Limitations of LDA
• Fails to model correlation between topics
• Stems from the implicit independence assumption in the Dirichlet distribution on the topic proportions in documents
• Topics are usually correlated in texts.
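The limitation can be seen in a small simulation: under a Dirichlet, any two topic proportions are negatively correlated, so the model has no way to express topics that tend to appear together. The sketch below is an illustration only; the symmetric parameters (1, 1, 1), the sample size, and the helper names are assumptions made here.

```python
import random

def sample_dirichlet(alpha, rng):
    """Sample from a Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(1)
# Topic proportions for 5,000 simulated documents with three topics.
samples = [sample_dirichlet([1.0, 1.0, 1.0], rng) for _ in range(5000)]
# Theory gives a correlation of -0.5 for this symmetric three-part case.
corr_12 = pearson([s[0] for s in samples], [s[1] for s in samples])
```

Changing the Dirichlet parameters changes the size of the negative correlation but never its sign.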
LDA & Beyond
• Correlated Topic Model (CTM; Blei & Lafferty, 2007)
• Replaces the Dirichlet in LDA with the “more flexible logistic normal distribution” (p. 19).
• Blei and Lafferty cite Aitchison & Shen (1980), Aitchison (1982), and Aitchison (1985).
Blei, D. M., & Lafferty, J. D. (2007). A correlated topic model of science. The Annals of Applied Statistics, 1(1), 17–35. https://doi.org/10.1214/07-AOAS114
Aitchison, J. (1982). The statistical analysis of compositional data. Journal of the Royal Statistical Society. Series B (Methodological), 44(2), 139–177.
Aitchison, J. (1985). A general class of distributions on the simplex. Journal of the Royal Statistical Society. Series B (Methodological), 47(1), 136–146.
Aitchison, J., & Shen, S. M. (1980). Logistic-normal distributions: Some properties and uses. Biometrika, 67(2), 261–272.
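A minimal sketch of why the logistic normal is more flexible: topic proportions are obtained by drawing a Gaussian vector with a free covariance matrix and mapping it to the simplex with a softmax, so two topics can be made to rise and fall together. The mean, the Cholesky factor (which implies a correlation of 0.9 between the first two Gaussian components), and the function names below are hypothetical values chosen for illustration.

```python
import math
import random

def softmax(eta):
    """Map a real vector to proportions that sum to 1."""
    m = max(eta)
    exps = [math.exp(e - m) for e in eta]
    total = sum(exps)
    return [e / total for e in exps]

def sample_logistic_normal(mu, chol, rng):
    """Draw eta ~ N(mu, L L^T) from a lower-triangular factor L, then apply softmax."""
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    eta = [mu[i] + sum(chol[i][j] * z[j] for j in range(i + 1)) for i in range(len(mu))]
    return softmax(eta)

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Hypothetical parameters: components 1 and 2 correlate 0.9, component 3 is independent.
mu = [0.0, 0.0, 0.0]
chol = [[1.0, 0.0, 0.0],
        [0.9, math.sqrt(1 - 0.81), 0.0],
        [0.0, 0.0, 1.0]]
rng = random.Random(2)
thetas = [sample_logistic_normal(mu, chol, rng) for _ in range(5000)]
corr_12 = pearson([t[0] for t in thetas], [t[1] for t in thetas])
```

With these settings the first two topic proportions come out positively correlated, which a Dirichlet cannot produce.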
Structural Topic Model (STM)
• Our research used STM, which builds on CTM
• Allows topic correlations
• Allows covariates (i.e., predictors of topic proportions)
• We collected 255 R/S measures published between 1929 and 2016 to identify the latent dimensions of their text.
Atkins, D. C., Rubin, T. N., Steyvers, M., Doeden, M. A., Baucom, B. R., & Christensen, A. (2012). Topic models: A novel method for modeling couple and family text data. Journal of Family Psychology, 26, 816–827. https://doi.org/10.1037/a0029607
Preprocessing
• R ‘tm’ package (Feinerer & Hornik, 2015)
• Items of 255 R/S measures
• Preprocessed texts
• Removed stop words, numbers, and punctuation.
• e.g., a/an, the, to, for, at, she/he, I, ., or ?
• Lemmatized words
• e.g., educate, educated, or educating → educate
Feinerer, I. & Hornik, K. (2015). tm: Text Mining Package (Version 0.6-2) [Computer software]. Retrieved from https://CRAN.R-project.org/package=tm.
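The preprocessing steps above can be illustrated with a short Python sketch. The analysis itself used the R ‘tm’ package; this is not that code. The stop-word set is a small sample, and the `LEMMAS` lookup table is a toy stand-in for a real lemmatizer; both are assumptions made here.

```python
import re

# Small sample of stop words taken from the examples on the slide.
STOP_WORDS = {"a", "an", "the", "to", "for", "at", "she", "he", "i"}
# Toy lemma table; a real pipeline would use a lemmatizer rather than a hand-written dict.
LEMMAS = {"educated": "educate", "educating": "educate"}

def preprocess(text):
    """Lowercase, drop punctuation and numbers, remove stop words, then lemmatize."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [LEMMAS.get(t, t) for t in tokens if t not in STOP_WORDS]

tokens = preprocess("She educated the 3 children; he is educating others.")
```

Keeping only alphabetic runs removes numbers and punctuation in one pass, at the cost of splitting contractions such as "don't".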
Preprocessing
• Created a document-term matrix
• Dimensions: 255 × 5,617 (measures × terms)
• Included
• unigrams
• bigrams (e.g., Jesus Christ)
• trigrams (e.g., religious (and/or) spiritual belief)
• Deleted low-frequency terms (< 3)
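A document-term matrix with unigrams, bigrams, and trigrams can be sketched as follows. This is a simplified illustration, not the code used in the study: the three toy documents are invented, and reading "< 3" as a total count across the whole corpus is an assumption.

```python
from collections import Counter

def ngrams(tokens, n_max=3):
    """Return all unigrams, bigrams, and trigrams as space-joined strings."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def document_term_matrix(docs, min_count=3):
    """Build a documents x terms count matrix, dropping terms seen fewer than min_count times overall."""
    counts = [Counter(ngrams(doc)) for doc in docs]
    totals = Counter()
    for c in counts:
        totals.update(c)
    vocab = sorted(t for t, k in totals.items() if k >= min_count)
    matrix = [[c[t] for t in vocab] for c in counts]
    return vocab, matrix

# Invented, already-preprocessed documents.
docs = [
    ["jesus", "christ", "love"],
    ["jesus", "christ", "pray"],
    ["jesus", "christ", "church"],
]
vocab, dtm = document_term_matrix(docs)
```

Each row of `dtm` is one document and each column one retained term, which matches the documents-by-terms layout described above.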
Model Estimation
• R ‘stm’ package (Roberts, Stewart, & Tingley, 2016)
• Topics
• Latent dimensions of text data
• Comparable to principal components or factors
• Estimated based on word co-occurrences across documents
• Structural topic modeling
• Estimate covariates’ effect on topic proportions
• Current analysis: decade of publication (1950s through 2010s) as a predictor
Roberts, M. E., Stewart, B. M., & Tingley, D. (2016). stm: R Package for Structural Topic Models (Version 1.1.3) [Computer software]. Retrieved from http://www.structuraltopicmodel.com
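As a rough sketch of how a covariate such as decade can enter the model, the topic proportions of document d can be written as a logistic normal whose mean depends on that document's covariates. The notation below is a simplified reading of the structural topic model, not a full specification: X_d (the covariate row for document d), γ (the coefficient matrix), Σ (the topic covariance), and K (the number of topics) are symbols introduced here for illustration.

```latex
\eta_d \sim \mathcal{N}\!\left(X_d \gamma,\; \Sigma\right),
\qquad
\theta_{d,k} = \frac{\exp(\eta_{d,k})}{\sum_{j=1}^{K} \exp(\eta_{d,j})},
\quad k = 1, \dots, K
```

Under this reading, a nonzero coefficient for decade shifts the expected proportion of a topic over time, while Σ still allows topics to correlate.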
Top 50 Frequent Terms
Diagnostic Indexes
3 Topics Identified
• Topic 1: Spirituality
spirituality, spiritual belief, religious spiritual, wilderness, never experience, spiritual experience, connect, illness, transcendent, transcendent spiritual
• Topic 2: Religion
church member, loving, teaching church, dealing, dealing life, local religious, join, local religious group, question meaning life, religious denomination
• Topic 3: Judeo-Christianity
christian, allah, miracle, god will, god god, punish, client, god feel, patient, writing
Longitudinal Change of Expected Topic Proportions from the 1950s to the 2010s
The estimated regression lines and their 95% confidence intervals are plotted.
Created using the R ‘compositions’ package (van den Boogaart, Tolosana, & Bren, 2015)
Van den Boogaart, K. G., Tolosana, R. & Bren, M. (2015). compositions: R Package for Compositional Data Analysis (Version 1.40-1) [Computer software]. Retrieved from https://cran.r-project.org/web/packages/compositions/index.html
Normal Distribution on the Simplex
Topic Correlations
1. exp(−var(z)): Buccianti & Pawlowsky-Glahn (2005)
• z = ilr-transformed parts
• Values near 0 (1) → high (low) variability of the ratios between parts
• e.g., .0016 for Topics 1 and 2
2. exp(−τ²/2): van den Boogaart & Tolosana-Delgado (2013)
• τ: Variation
• Interpret this as a correlation coefficient
• Very small between topics
Buccianti, A., & Pawlowsky-Glahn, V. (2005). New perspectives on water chemistry and compositional data analysis. Mathematical Geology, 37(7), 703–727.
Van den Boogaart, K. G., & Tolosana-Delgado, R. (2013). Analyzing compositional data with R (Vol. 122). Heidelberg: Springer.
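The first index can be sketched in Python under one stated assumption: that z is the ilr balance between the two parts being compared, z = ln(x_i / x_j) / √2. The block also returns the log-ratio variance (the variation) that both indices build on. The function names and the toy proportions are hypothetical; the study's own values were computed in R.

```python
import math

def variance(values):
    """Sample variance (n - 1 denominator)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

def proportionality_index(part_i, part_j):
    """Return (exp(-var(z)), variation) for two parts observed across documents."""
    log_ratios = [math.log(a / b) for a, b in zip(part_i, part_j)]
    z = [lr / math.sqrt(2) for lr in log_ratios]  # assumed ilr balance of the two parts
    variation = variance(log_ratios)              # var(ln(x_i / x_j))
    return math.exp(-variance(z)), variation

# Toy proportions: the first pair keeps a constant ratio, the second pair does not.
index_const, var_const = proportionality_index([0.2, 0.4, 0.6], [0.1, 0.2, 0.3])
index_vary, var_vary = proportionality_index([0.7, 0.2, 0.1], [0.1, 0.3, 0.6])
```

A pair of parts with a constant ratio has zero log-ratio variance and an index of 1; the more the ratio varies across documents, the closer the index gets to 0.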
THANK YOU