Everything is focused on the German nation, with words like "Deutschtum" ("Germanness", four times). In 1917, the word cloud looks different:
With this turn, he is positioning himself far away from extreme right-wing philosophers like Oswald Spengler as well as from Marxist thinkers like Ernst Bloch. On both sides, the slogan "Ex oriente lux" was proclaimed. Weber has gone West instead. Co-occurrences between some keywords show what that means.
In 1904, Weber had travelled quite extensively in the US. But this cannot be the turning point. First, among the observations in his letters from the States there are several critical comments about racial discrimination. The Weber of 1895 would presumably have been more tolerant of similar traditions: he had witnessed the German treatment of Polish immigrants and inhabitants of Prussia without voicing any moral considerations.
Secondly, looking at the co-occurrences in the first part of “The Protestant Ethic and the Spirit of Capitalism”, the one Weber published before travelling to the States, we already notice something new:
The East is far less important, although America seemingly does not dominate yet. Or does it? In this first part, Weber extensively discusses a text by Benjamin Franklin. Once the name “Franklin” is added to the keyword list, the overall impression given by the most frequent co-occurrences indeed changes:
In this first part of the “Protestant Ethic”, the name of Benjamin Franklin appears ten times.
For example, the most frequent bigrams are:
> textstat_frequency(ngram_dfm, n=20)
                feature frequency rank docfreq
1    geist_kapitalismus        14    1       1
2 modernen_kapitalismus         8    2       1
3           ganz_ebenso         7    3       1
4    benjamin_franklins         7    3       1
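A feature like geist_kapitalismus joins two words that are not adjacent in the running text ("Geist des Kapitalismus"): in the script at the end of this post, short words and stopwords are removed before the n-grams are built. The following base-R sketch illustrates that effect on a toy sentence. The sentence and all variable names are assumptions made for illustration; it only mimics the length filter, not the full quanteda pipeline.

```r
# Toy sentence, not taken from the analysed corpus.
text <- "Der Geist des Kapitalismus ist der Geist des modernen Kapitalismus"

# Lower-case and split on anything that is not a letter.
words <- strsplit(tolower(text), "[^[:alpha:]]+")[[1]]
words <- words[nzchar(words)]

# Mimic the filtering step: keep only words with at least four characters.
# (The script also removes German stopwords; here the length filter suffices.)
kept <- words[nchar(words) >= 4]

# Join each remaining word with its successor to form bigrams.
bigrams <- paste(head(kept, -1), tail(kept, -1), sep = "_")

# Frequency table, most frequent first.
bigram_freq <- sort(table(bigrams), decreasing = TRUE)
print(bigram_freq)
```

Because "des" is dropped before pairing, "geist" and "kapitalismus" become neighbours. Bigram counts obtained this way therefore describe content-word pairs rather than literal two-word sequences.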
Furthermore, “Amerika” is named only once, but its derivatives (KWIC search “amerika*”: “amerikanisch” …) appear nine times. Adding these occurrences to those of “Franklin”, we see that Weber is already in America without having travelled there yet.
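The distinction between the bare word and its derivatives can be sketched in base R with a glob pattern, similar in spirit to the “amerika*” KWIC search. The token vector and the helper count_glob() below are hypothetical and serve only as an illustration; they are not the counts from the corpus.

```r
# Toy token vector; the words are illustrative, not the real corpus.
tokens <- c("Amerika", "amerikanisch", "amerikanischen", "Franklin",
            "Okzident", "okzidentalen", "Kapitalismus", "Amerikaner")

# Hypothetical helper: count tokens matching a glob pattern, ignoring case.
count_glob <- function(tokens, pattern) {
  regex <- utils::glob2rx(pattern)  # translate the glob into an anchored regex
  sum(grepl(regex, tokens, ignore.case = TRUE))
}

exact   <- count_glob(tokens, "amerika")           # the bare word only
derived <- count_glob(tokens, "amerika*") - exact  # longer forms only

cat("exact:", exact, "derived:", derived, "\n")
```

Subtracting the exact matches from the wildcard matches separates the noun from its adjectival and other derived forms, which is the comparison made in the paragraph above.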
In the text, we also find a new geographic term: “Okzident*” appears 43 times. With “Okzident” added to the list of countries, the panorama looks like this:
Other statistics
The differences in style between the speech from 1895 and the one from 1919 are not really overwhelming. With 31 tokens per sentence against 41 in 1895, Weber’s sentences are shorter in 1919, but still rather long. The punctuation differs slightly: in 1895 we encounter a colon every 11 full stops, in 1919 every 5.6. The later text may thus be rhetorically better suited to oral delivery. As for lexical richness, both texts, written by one and the same highly educated author, are quite similar, with a type-token ratio (TTR) of .26 in the shorter speech from 1895 and .23 in the other.
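For readers who want to reproduce this kind of figure, here is a minimal base-R sketch of the three measures (tokens per sentence, full stops per colon, TTR). It runs on a toy text, and its tokenisation rules (letters only, sentences ended by ".", "!" or "?") are assumptions; the numbers quoted above may have been obtained with different rules.

```r
# Toy text standing in for a speech; not the real corpus.
text <- paste("Politik ist Kampf.",
              "Wer Politik treibt, erstrebt Macht: Macht als Mittel.",
              "Das ist alles.")

# Tokens: runs of letters, lower-cased.
words <- tolower(unlist(regmatches(text, gregexpr("[[:alpha:]]+", text))))

# Sentences are approximated by counting ".", "!" and "?".
n_sentences <- lengths(regmatches(text, gregexpr("[.!?]", text)))
n_colons    <- lengths(regmatches(text, gregexpr(":", text)))
n_stops     <- lengths(regmatches(text, gregexpr("\\.", text)))

tokens_per_sentence <- length(words) / n_sentences
stops_per_colon     <- n_stops / n_colons
ttr                 <- length(unique(words)) / length(words)

cat("tokens per sentence:", tokens_per_sentence, "\n")
cat("full stops per colon:", stops_per_colon, "\n")
cat("TTR:", ttr, "\n")
```

Note that the TTR tends to fall as a text grows longer, so a slightly higher value for the shorter speech does not by itself indicate a richer vocabulary.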
What really changes are the words most frequently used.
“Politics as a Vocation” (OK, I could have stemmed the words …):
> textstat_frequency(komplett_dfm, n=20)
       feature frequency rank docfreq group
1      politik        70    1       1   all
2  politischen        58    2       1   all
3        macht        57    3       1   all
4         ganz        45    4       1   all
5     parteien        38    5       1   all
6       mittel        33    6       1   all
7       partei        31    7       1   all
8    politiker        29    8       1   all
9        ethik        28    9       1   all
10  politische        26   10       1   all
11       immer        25   11       1   all
12       sache        24   12       1   all
13  herrschaft        24   12       1   all
14        sinn        24   12       1   all
15       leben        24   12       1   all
16        rein        22   16       1   all
17        mehr        22   16       1   all
18       sagen        21   18       1   all
19      gerade        21   18       1   all
20        welt        21   18       1   all
Looking for bigrams:
> ngram_dfm <- dfm(toks_ngram)
> textstat_frequency(ngram_dfm, n=20)
                  feature frequency rank docfreq
1           politik_beruf         7    1       1
2         politik_treiben         5    2       1
3  örtlichen_honoratioren         5    2       1
4       politischen_macht         4    4       1
5           politik_leben         4    4       1
6         betrieb_politik         4    4       1
7     vereinigten_staaten         4    4       1
8           beruf_politik         4    4       1
9   bürgerlichen_parteien         4    4       1
10           spoil_system         4    4       1
In the “Antrittsrede” (the inaugural lecture of 1895):
> textstat_frequency(komplett_dfm, n=20)
        feature frequency rank docfreq group
1        nation        33    1       1   all
2     deutschen        24    2       1   all
3       unserer        23    3       1   all
4  ökonomischen        22    4       1   all
5   politischen        18    5       1   all
6   ökonomische        18    5       1   all
7         heute        16    7       1   all
8        gerade        15    8       1   all
9       eigenen        15    8       1   all
10      zukunft        14   10       1   all
11   politische        14   10       1   all
12       allein        13   12       1   all
13       arbeit        13   12       1   all
14 wissenschaft        13   12       1   all
15     sozialen        12   15       1   all
The presence of economics is quite evident here. Remember, Weber began his career as a Professor of Economics. In bigrams:
> textstat_frequency(ngram_dfm, n=20)
                   feature frequency rank docfreq group
1  ökonomische_entwicklung         4    1       1   all
2   physischen_psychischen         3    2       1   all
3            letzter_linie         3    2       1   all
4    ökonomischen_sozialen         3    2       1   all
5     unserer_wissenschaft         3    2       1   all
6    frieden_menschenglück         3    2       1   all
7   machtinteressen_nation         3    2       1   all
8           leitung_nation         3    2       1   all
9            politi-_schen         3    2       1   all
10    deutschen_bürgertums         3    2       1   all
11   politischer_erziehung         3    2       1   all
12         1871-1885_zeigt         2   12       1   all
13            kampf_dasein         2   12       1   all
library(seededlda)
library(quanteda)
library(quanteda.textstats)
library(SnowballC)
library(readtext)
library("wordcloud")
library(koRpus)
library(stopwords)
library(scales)
library(tidyverse)
# Gesammelte Politische Schriften - Universität Potsdam https://verlagsarchivweb.ub.unipotsdam.de
# following https://data.library.virginia.edu/reading-pdf-files-into-r-for-text-mining/
# data_char_HegelPhWallace <-
#   texts(readtext("https://www.gutenberg.org/files/39064/39064-0.txt"))
# names(data_char_HegelPhWallace) <- "Philosophy of Mind"
getwd()
#setwd("desktop/WeberR")
komplett_PS <- readtext("WeberPE1.txt", text_field = "texts")
names(komplett_PS)
length(komplett_PS)
summary(komplett_PS)
head(komplett_PS)
komplett_PS_c <- corpus(komplett_PS, text_field = "text")
names(komplett_PS_c)
tail(komplett_PS_c)
summary(komplett_PS_c)
head(komplett_PS_c)
ntoken(komplett_PS_c)%>%
sum()
ntype(komplett_PS_c)%>%
sum()
# Keyword-in-context searches, 10 tokens of context on each side
kontext1 <- kwic(quanteda::tokens(komplett_PS_c), "Franklin",
                 case_insensitive = TRUE, valuetype = "glob", window = 10)
kontext2 <- kwic(quanteda::tokens(komplett_PS_c), pattern = "Okzident*",
                 case_insensitive = TRUE, window = 10)
kontext3 <- kwic(quanteda::tokens(komplett_PS_c), pattern = "amerika*",
                 case_insensitive = TRUE, window = 10)
kontext4 <- kwic(quanteda::tokens(komplett_PS_c), pattern = "Rasse*",
                 case_insensitive = TRUE, window = 10)
# Tokenise, then drop punctuation, numbers, short words and German stopwords
komplett_tokens <- quanteda::tokens(komplett_PS_c,
                                    remove_punct = TRUE,
                                    remove_symbols = FALSE,
                                    remove_numbers = TRUE) %>%
  tokens_select(min_nchar = 4) %>%
  tokens_remove(stopwords("de"))
head(komplett_tokens)
# Create a word cloud in red; only words occurring at least 45 times are drawn
dev.new(width = 5, height = 4, unit = "in")
plot(1:20)  # draws the points 1..20 on the new device
dev.new(width = 550, height = 330, unit = "px")
plot(1:15)  # draws the points 1..15 on the new device
wordcloud(komplett_tokens, min.freq = 45, colors = "red",
          scale = c(2, 2.11), random.order = TRUE)
komplett_dfm <- dfm(komplett_tokens)
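# length() on a tokens object counts documents, so count tokens and types explicitly
sum(ntoken(komplett_tokens))  # number of tokens after filtering
sum(ntype(komplett_tokens))   # number of types after filtering (summed over documents)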
textstat_frequency(komplett_dfm, n=20)
# n = 2:4 builds bigrams, trigrams and 4-grams; the most frequent ones are bigrams
toks_ngram <- tokens_ngrams(komplett_tokens, n = 2:4)
ngram_dfm <- dfm(toks_ngram)
textstat_frequency(ngram_dfm, n=20)
#for (k in 3:17) {
#fund_lda <- textmodel_lda(komplett_dfm, k)
#print(terms(fund_lda, 10))
#}








