All Things Techie With Huge, Unstructured, Intuitive Leaps

Sentiment Analysis And Data Mining To Understand The World's Problems of Today


I was genuinely perplexed.  The world is a vastly different place than I envisioned it as a teenager. It seemed that the continued enlightenment and scientific advancement in the years from post World War II to the turn of the millennium would bring the world into a less chaotic global village with a greater degree of peace, stability and economic well-being for man.  In many respects, the world has regressed.

Purely for my own understanding, I decided to try to figure out some reasons for the current problems of the world, using my skills in data mining.  I took twenty top international news sites and scraped their content with open source tools. The result was a collection of stories, a snapshot of the world today in microcosm.  That collection would be a good starting point for a list of the major problems of the world.
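The scraping step can be illustrated with a minimal sketch. The post does not name the tools used, so this assumes the pages have already been downloaded and that story text lives in `<p>` elements; the class name, function name and sample page are all hypothetical, and only Python's standard library is used.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the visible text of every <p> element in an HTML page."""

    def __init__(self):
        super().__init__()
        self._depth = 0        # > 0 while we are inside a <p> element
        self._buffer = []      # text fragments of the current paragraph
        self.paragraphs = []   # finished paragraphs, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "p" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Join fragments, then collapse runs of whitespace.
                text = " ".join("".join(self._buffer).split())
                if text:
                    self.paragraphs.append(text)
                self._buffer = []

    def handle_data(self, data):
        # Text outside <p> (headlines, scripts, menus) is ignored.
        if self._depth > 0:
            self._buffer.append(data)


def extract_paragraphs(html):
    """Return the paragraph texts found in an HTML string."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


# Hypothetical page content standing in for a downloaded news article.
SAMPLE_HTML = """
<html><body>
  <h1>Floods displace thousands</h1>
  <p>Heavy rain <b>flooded</b> the river valley.</p>
  <script>var x = 1;</script>
  <p>Relief agencies are on   the way.</p>
</body></html>
"""

stories = extract_paragraphs(SAMPLE_HTML)
for story in stories:
    print(story)
```

Real news pages are messier than this sample, so a production scraper would need per-site rules for where the story body actually sits.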

As preliminary research, I looked at what was already out there in the public domain. Eurobarometer had conducted a poll across the length and breadth of Europe, and had come up with the following list of the top ten major world problems:

  • #10 Don't Know
  • #9 Proliferation Of Nuclear Weapons
  • Tied #7 Armed Conflict
  • Tied #7 Spread Of Infectious Disease
  • #6 The Increasing Global Population
  • #5 Availability Of Energy
  • #4 International Terrorism
  • #3 The Economic Situation
  • #2 Climate Change
  • #1 Poverty, Hunger And Lack Of Drinking Water

It is interesting that two percent of the Europeans polled answered with "Don't Know".  Not knowing was the reason that I conducted this exercise in the first place.

After I had my collection of data from the news sources, I decided to do a bottom-up analysis of the news.  I tagged each story with a tag that generally summarized the theme of the story.  I ended up with a lot of tags, so I needed to do some feature engineering: I added a layer of abstraction to the tags so that similar stories could be grouped together.  I kept adding layers of abstraction until I got a manageable number of tags, and then did a bottom-up Naive Bayes classification of the tags.  The classifiers neatly categorized the stories.
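For readers unfamiliar with Naive Bayes, here is a minimal sketch of a multinomial Naive Bayes text classifier with add-one smoothing, written with the standard library only. It is not the classifier used for this study (the post does not say which implementation was used); the class, the tokenizer and the four training headlines are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase, split on whitespace, keep purely alphabetic words."""
    return [w for w in text.lower().split() if w.isalpha()]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()              # stories per label
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        words = [w for w in tokenize(text) if w in self.vocab]
        best_label, best_score = None, -math.inf
        for label in sorted(self.doc_counts):
            # log P(label) + sum of log P(word | label)
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in words:
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training stories, already tagged by hand.
train_texts = [
    "drought and wildfire threaten farmland",
    "rising sea levels flood coastal towns",
    "minority group protests for equal rights",
    "ethnic minority denied voting rights",
]
train_labels = [
    "Environment",
    "Environment",
    "Alienation/Marginalization",
    "Alienation/Marginalization",
]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("wildfire smoke covers coastal farmland"))
print(model.predict("court rules on minority rights"))
```

Words the model has never seen are simply skipped at prediction time, which is one common convention; other implementations smooth them instead.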

I didn't just want a grocery list of the problems.  I was looking for something deeper. I was looking for answers related to the human condition, and how we, as a varied group of humans who inhabit this earth, feel about, react to, create and possibly solve these problems.  Consequently, I created another layer of abstraction, a broad-brush set of problem categories that condensed the list into a smaller but cogent set relating directly to the human condition.  Once I had the bottom-up tag analysis done, I decided to do a top-down sentiment analysis of my problem tags.  It would be interesting to see how my analysis would fare against the Eurobarometer analysis.
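One simple way to score sentiment per tag is with a word lexicon, sketched below. This is an assumption for illustration only: the post does not describe the sentiment method actually used, and the two tiny word lists, the function names and the sample stories are all hypothetical.

```python
from collections import defaultdict

# Hypothetical, deliberately tiny sentiment lexicons.
POSITIVE = {"hope", "progress", "peace", "recovery"}
NEGATIVE = {"crisis", "threatens", "conflict", "collapse", "war"}


def sentiment_score(text):
    """Return a score in [-1, 1]: (positive - negative) / matched words."""
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    matched = pos + neg
    return 0.0 if matched == 0 else (pos - neg) / matched


def mean_sentiment_by_tag(tagged_stories):
    """Average the story scores within each problem tag."""
    scores = defaultdict(list)
    for tag, text in tagged_stories:
        scores[tag].append(sentiment_score(text))
    return {tag: sum(vals) / len(vals) for tag, vals in scores.items()}


# Hypothetical (tag, story text) pairs.
tagged_stories = [
    ("Environment", "Drought crisis threatens harvest."),
    ("Environment", "Cleanup effort brings hope and progress."),
    ("Migrant Problems", "Crisis deepens as conflict spreads."),
]

means = mean_sentiment_by_tag(tagged_stories)
for tag, value in means.items():
    print(f"{tag}: {value:+.2f}")
```

A lexicon this small misses negation and sarcasm entirely, so it is only a starting point for comparing tags against each other.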

Don't forget, my list came from the news sites, so it represents a snapshot of what was in the forefront during this particular time period.  Here is my list of twelve issues:
  • Africa Issues
  • Alienation/Marginalization of peoples/societies/groups
  • Business Sector Wars/Competition
  • Economical Structural Change
  • Environment 
  • Globalization
  • Mass Media/Censorship/Subjectivity
  • Migrant Problems
  • Nationalism
  • Partisanship
  • Religious Fundamentalism/Jihadism/Religious Wars
  • Technology Frontiers/Problems

The differences between my list and the Eurobarometer list were apparent.  Africa was not on the Eurobarometer list, whereas in the news of late it was represented as its own category, and indirectly in the migrant issues (although the migrant issues were a global phenomenon, including the Caribbean, where Haitians are fleeing their homeland and causing problems in the neighboring countries).

As I tried to understand root causes, one of the surprising inclusions on my list was Alienation/Marginalization of peoples/societies/groups. This included stories about gay rights, Kurdish struggles in other countries, Sunni versus Shia, Basques versus Spaniards, etc.

So how did my sentiment analysis turn out?  For my limited study, the environment is the number one issue in terms of global problems. Here is the list, with the percentage of stories connected to each issue.
    
  • Environment: 54.21%
  • Alienation/Marginalization of peoples/societies/groups: 23.29%
  • Mass Media/Censorship/Subjectivity: 8.48%
  • Migrant Problems: 3.92%
  • Globalization: 1.75%
  • Business Sector Wars: 1.71%
  • Technology Frontiers: 1.67%
  • Partisanship: 1.35%
  • Nationalism: 1.26%
  • Religious Fundamentalism: 1.03%
  • Africa Issues: 0.95%
  • Economical Structural Change: 0.36%
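
The figures in this list add up to roughly 100 (99.98), so each one reads as a category's share of all the tagged stories. A minimal sketch of how such shares could be computed, rolling fine-grained story tags up into broad categories first, follows. The mapping, the tag names and the story counts are hypothetical and are not the data behind the list.

```python
from collections import Counter

# Hypothetical mapping from fine-grained story tags to broad categories.
ROLLUP = {
    "drought": "Environment",
    "wildfire": "Environment",
    "gay rights": "Alienation/Marginalization",
    "kurdish struggle": "Alienation/Marginalization",
    "press freedom": "Mass Media/Censorship/Subjectivity",
}


def category_shares(story_tags):
    """Return each broad category's share of stories, in percent."""
    counts = Counter(ROLLUP.get(tag, "Uncategorized") for tag in story_tags)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders categories from largest to smallest count.
    return {cat: round(100 * n / total, 2) for cat, n in counts.most_common()}


# Hypothetical tags, one per scraped story.
story_tags = [
    "drought", "drought", "drought", "wildfire",
    "gay rights", "gay rights", "kurdish struggle",
    "press freedom",
]

shares = category_shares(story_tags)
for category, pct in shares.items():
    print(f"{category}: {pct}%")
```

Stacking several such mappings, each one coarser than the last, gives the layers of abstraction described earlier.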

It seems that most conflicts in the world arise from the number two problem, Alienation/Marginalization of peoples/societies/groups.  This is probably the root cause of most social problems facing any area of the globe today.  Everyone wants and needs their own place in the sun, and others are trying to prevent them from having it, for a whole range of reasons.

People also seem to be concerned about their sources of information.  Right wing groups accuse the mainstream media of liberal bias.  Conservative news sites are mocked as Faux News. It seems that in the plethora of information sources, everyone has a hidden agenda, and folks are concerned about it. Objective information is very hard to find, with the democratization of information dissemination on the internet.

There is no need to further expound on migrant problems, which came in at number 4 on my list. It is hugely topical.

There are still worries about globalization, but it doesn't have the same impact as the people or environment related stories.

It is interesting that business and technology appear on the list of problems. The general sentiment is that business is anti-humanistic, pursuing profit for profit's sake at the expense of the human condition. Technology is seen as a threat, with artificial intelligence, killer robots and job destruction.

The next two categories, partisanship and nationalism, are somewhat related.  Both are stories about people interacting within their own countries.  Partisanship is now rampant, with a gridlocked Congress versus the president and the Confederate flag issue. Nationalism is seen in various venues around the globe: Scotland wants to exit the United Kingdom, Great Britain wants to exit the European Union, the Basque Country and Catalonia want to exit from Spain, Quebec wanted to separate from Canada, ad infinitum.

Religious fundamentalism is inexplicably rising. There seems to be a growing intolerance between mainstream and fundamentalist believers.  This is not only seen in the Muslim world, but also in the US, where a county clerk refused to issue marriage licenses to gay couples because of fundamentalist religious beliefs.  We have seen Baptist churches picketing the funerals of American soldiers slain overseas, on religious grounds.  Who would have predicted this shift 30 years ago? I would be interested in knowing why there is a swing to fundamentalism in the modern world.  In broad brush strokes, this seems to be a struggle between progression and regression, and it is inexplicable to rational thought.

Africa is low on the list, but concerning.  Africa was the site of proxy wars between the superpowers over the last 60 years or more, and now there is currency collapse, armed conflict, epidemics, partisan in-fighting, loss of democracy and pretty much any social, economic or environmental ill that anyone can name.  Africa creates instability in the global village.

Bringing up the bottom of the list is fundamental economic change.  Long-term jobs are being replaced by the gig economy. Manufacturing is undergoing fundamental changes. The biggest profits now come from virtual paper transactions on Wall Street, where the one-percenters jerk the economy around with their financial derivatives and dark markets.

Certainly this exercise has opened the window and shed some light for me, but as usual, answers to these issues are elusive, complex and in many cases there are no apparent ones.  Life does seem to go on.
