All Things Techie With Huge, Unstructured, Intuitive Leaps
Showing posts with label predictive analysis. Show all posts

Malaysia Airlines Flight 370 ~ My Analysis

(click on graphic for larger image)
Update:  I made this prediction four days before any authority did.

Here is my two-cent analysis of the Malaysia Airlines Flight 370 mystery.  Point A on the graphic is where the transponders stopped working at cruise altitude.  Point B is where military radar last saw Flight 370, still at cruising altitude in the Strait of Malacca, near the tiny island of Pulau Perak.

There are two possibilities at point A.  The first is that the plane suffered a severe mechanical malfunction that took out the transponders, and the pilot turned the plane around.  The second possibility is that the plane was hijacked, either by a pilot with mental issues or by terrorists.  In that case the transponders would have been turned off, either on the terrorists' orders or by the mad pilot.  I saw a Pentagon analyst say that if this was a terrorist attack, it could be a dry run or a feasibility study to see if this would work in the United States.

The heading is way off course.  There are two possibilities again.  Either the pilot believed that he was headed back to Kuala Lumpur and was mistaken, or the course is deliberate.  It doesn't much matter until point B is reached.

If the pilot is flying a damaged plane and believes that he is on course back to Kuala Lumpur, he will hold his course to point C until he sees the lights of the city.  Unfortunately, he gets to North Sumatra and there is no city.  He keeps flying until the plane falls out of the sky between points C and E.  If the pilot was sane and not under coercion, he would realize by point E that he had flown long enough to reach Kuala Lumpur and had missed it.  In the same state of mind, he should also have realized that he was not over the Malay Peninsula, and he might have turned south to point D; on reaching point D, he should have realized that he had missed Kuala Lumpur.  In the damaged plane, sane pilot scenario, the wreckage should be found in the orange circle, or at most in the pink circle, or, if there was a course change, in the yellow circle around C, D, and E.

The terrorist or mad pilot scenario is scarier.  If there was a terrorist or if the pilot went mad, there was enough fuel in the plane to take the flight out into the deep Indian Ocean to point F and beyond, so that the plane would never be found, or not found for a long time.  If the plane ditched near land, the water is relatively shallow compared to the surrounding ocean.  The same scenario would hold true if the plane depressurized, killing all aboard, and then continued to fly on autopilot, as in the case that killed golfer Payne Stewart.

So there you have it.  If there is a reasonable explanation, I would look in the Indonesian waters off North Sumatra for the wreckage.  If the loss of the plane defies rationality, then the wreckage, if it is ever found, will be somewhere far out in the Indian Ocean.

Time will tell.

Super Bowl XLVIII Predictions From A Datamining Geek


First of all, let me say that these predictions will probably be somewhat wrong.  But what if they aren't?  These predictions are made without looking at the football performance of the two teams -- Seattle and Denver.  Instead, they are made by analyzing social media and datamining the web for crowd chatter.  The premise is that the crowd is always right, and perhaps it will be on this one.  I like to call this technique "crowd mining".  So let's get to it.
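To make the idea of crowd mining a little more concrete, here is a minimal sketch in Python.  It is purely illustrative and is not the analysis behind the numbers in this post: the word lists, the scoring rule, the function name crowd_score and the sample posts are all made up.

```python
import re
from collections import Counter

# Hypothetical word lists; a real analysis would need a far richer lexicon.
TEAM_ALIASES = {
    "Seattle": {"seattle", "seahawks"},
    "Denver": {"denver", "broncos"},
}
POSITIVE = {"win", "wins", "winning", "unstoppable", "love"}
NEGATIVE = {"lose", "loses", "losing", "choke", "overrated"}


def crowd_score(posts):
    """Return a Counter of net sentiment per team.

    A post counts only if it mentions exactly one team. Its score is the
    number of distinct positive words minus the number of distinct
    negative words it contains.
    """
    scores = Counter()
    for post in posts:
        words = set(re.findall(r"[a-z']+", post.lower()))
        mentioned = [team for team, aliases in TEAM_ALIASES.items()
                     if words & aliases]
        if len(mentioned) != 1:
            continue  # ambiguous or irrelevant post
        scores[mentioned[0]] += len(words & POSITIVE) - len(words & NEGATIVE)
    return scores


# Made-up sample posts, not real data.
sample_posts = [
    "Seahawks defense is unstoppable, Seattle wins this",
    "Broncos will choke again",
    "I love the Seahawks",
    "Denver wins easy",
    "Seattle or Denver? No idea",
]
scores = crowd_score(sample_posts)
print(scores.most_common())
```

A tally like this only says which team the crowd favors.  Turning that into a score line takes a further modelling step that this sketch does not attempt.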

Going on crowd behavior alone, my research and dubious math skills predict that the winner will be the Seattle Seahawks.  After I crunched the numbers, I was disheartened to learn that Peyton Manning is the Broncos quarterback.  He is a formidable force, and I am glad that I didn't know this before I started the exercise, otherwise it would have skewed my results.  It may also be the reason why these predictions could be wrong.

But what numbers does the crowd sentiment suggest?

Seattle 31, Denver 24


Then I decided to apply some statistical normalization to the numbers, and I got the following result:


Seattle 27, Denver 21


So I am going to do something that I have never done before and lay a bet on those two results.  Various pundits figure that over $100 million will be bet on this Super Bowl.

As a further step in statistical analysis, you have to take into account fat tails, or random events.  Suppose that for some reason there is a blowout or an injury, or a team can't play well in the weather, or the awesome Manning offense is brought down by the Seattle defense.  To cover the entire range of possibilities, the following are the complete fat tail results.  The left column is the Seahawks and the right column is the Broncos.  If these predictions are accurate and we get a weird game, such as a very high scoring or a very low scoring one, here is the range of predicted results.

Seattle       Denver

3       to     0
7       to     3
10      to     7
13      to     10
17      to     13
21      to     14
24      to     17
27      to     20
27      to     21
30      to     23
31      to     24
33      to     26
35      to     27
35      to     28
36      to     29
37      to     30
38      to     35
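As a quick arithmetic check on the table above, the short Python sketch below computes the Seattle margin and the combined score for each row.  The score pairs are copied from the table; the variable names are only illustrative.

```python
# Score pairs (Seattle, Denver) copied from the table above.
fat_tail_scores = [
    (3, 0), (7, 3), (10, 7), (13, 10), (17, 13), (21, 14),
    (24, 17), (27, 20), (27, 21), (30, 23), (31, 24), (33, 26),
    (35, 27), (35, 28), (36, 29), (37, 30), (38, 35),
]

# Winning margin and combined points for every predicted score line.
margins = [seattle - denver for seattle, denver in fat_tail_scores]
totals = [seattle + denver for seattle, denver in fat_tail_scores]

print("Seattle margin ranges from", min(margins), "to", max(margins))
print("Combined score ranges from", min(totals), "to", max(totals))
```

Every row has Seattle ahead, by between 3 and 8 points, so the table varies how many points are scored rather than who wins.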

It will be interesting to see how these geekazoid predictions turn out.

Regress to Success -- RugbyMetrics

So let's suppose that you run a rugby team in the Aviva Premiership, or any other professional rugby club. You haven't qualified for the Heineken Cup, your team is full of journeymen players, and you consistently sit in the cellar of the standings table. And let's suppose that you don't have a Daddy Warbucks owner who can buy you a Dan Carter, and you want to create a competitive team.

So what are you going to do? You have to find young untried players who will eventually turn into Thomas Waldrom, Schalk Brits, Chris Ashton or Tom Wood. How are you going to identify them when they haven't had a chance to prove themselves and amass some statistics to show that they have the stuff of the egg-chasing gods?

You turn to the geeks, that's how you do it. How so? You regress your way to success. You would use my RugbyMetrics tool (click on this LINK to see all of the articles on RugbyMetrics). Then you would take game film of your targeted acquisition and, using the tool, digitize that player's performance. From there you would use advanced statistics to create a mathematical model (using regression and Bayesian inference) to determine if your player has the right stuff.
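For readers who have not met regression before, here is a minimal sketch of the simplest kind: an ordinary least squares line through one predictor, in Python. It is not the RugbyMetrics model and it leaves out the Bayesian part entirely; the function name fit_line and the numbers are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up numbers: a KPI measured from early game film (x) and a later
# performance rating (y) for five already-established players.
early_kpi = [1.0, 2.0, 3.0, 4.0, 5.0]
later_rating = [3.0, 5.0, 7.0, 9.0, 11.0]

intercept, slope = fit_line(early_kpi, later_rating)

# Predicted later rating for an untried player whose early KPI is 3.5.
predicted = intercept + slope * 3.5
print(intercept, slope, predicted)
```

The idea is to fit the line on players whose later careers are known, then feed in the digitized numbers of an untried player to get a prediction.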

How does it work? The seeds of athletic greatness are sown early. However, they may not become manifest because the player is not on a team that enhances his skillset, or because he is blindside oriented on a team that is predominantly openside oriented. There are many, many possible reasons. Even so, that player will demonstrate the subtle qualities that show he has the key performance indicators that tend to greatness.

So what are these KPIs, or key performance indicators? They are a new set of statistics gleaned from data mining every aspect of the game. They are proprietary knowledge for the users of the system. But as a trivial example, one finds that an Olly Barkley will average x carries, gaining y yards, in a certain ratio to the opposition yards gained. This is objective, scientific knowledge of the game of rugby that comes from the field of predictive analytics.
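The trivial example above can be sketched in a few lines of Python. This is only an illustration, not the proprietary KPI set: the MatchLine fields, the carry_kpis function and the sample numbers are hypothetical, and "opposition yards" is assumed to mean the total yards gained by the opposing team in the same match.

```python
from dataclasses import dataclass


@dataclass
class MatchLine:
    """One player's line from one digitized match (hypothetical fields)."""
    carries: int
    yards_gained: float
    opposition_yards: float  # assumed: yards gained by the opposing team


def carry_kpis(matches):
    """Average carries and yards per match, plus two simple ratios."""
    n = len(matches)
    if n == 0:
        raise ValueError("need at least one match")
    total_carries = sum(m.carries for m in matches)
    total_yards = sum(m.yards_gained for m in matches)
    total_opp = sum(m.opposition_yards for m in matches)
    return {
        "carries_per_match": total_carries / n,
        "yards_per_match": total_yards / n,
        "yards_per_carry": total_yards / total_carries if total_carries else 0.0,
        "yards_vs_opposition": total_yards / total_opp if total_opp else 0.0,
    }


# Made-up numbers for two matches.
matches = [MatchLine(8, 40.0, 400.0), MatchLine(12, 60.0, 600.0)]
kpis = carry_kpis(matches)
print(kpis)
```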

So once you have the three mathematical formulas gleaned from going through mountains of statistics, you can eliminate the pretenders and give yourself a roster of possible stars. This is not meant to replace the years of coaching and scouting, but rather it is meant to give the teams a scientific, valid starting point when scouting for new team members.

The interesting aspect is that the front eight will have different formulas than the back seven, and each position will have different regression parameters in the models. Style of play comes into it as well. If you like a Tom Wood style of play, you would determine the mathematical model by analyzing his performance and looking for players who have similar numbers to his. It sure beats the shot in the dark method of picking a player who "looks good".
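One simple way to look for players with "similar numbers" is a nearest-neighbor search over standardized KPI vectors. The Python sketch below assumes that approach, which is a stand-in and not necessarily what RugbyMetrics does; the three KPIs, the player names and all values are made up.

```python
import math


def zscore_columns(rows):
    """Standardize each KPI column to mean 0 and population std 1."""
    columns = list(zip(*rows))
    means = [sum(c) / len(c) for c in columns]
    # A constant column has std 0; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)]
            for row in rows]


def most_similar(reference, candidates):
    """Rank candidate names by Euclidean distance to the reference profile."""
    names = list(candidates)
    scaled = zscore_columns([reference] + [candidates[n] for n in names])
    ref, rest = scaled[0], scaled[1:]
    dist = {n: math.dist(ref, vec) for n, vec in zip(names, rest)}
    return sorted(names, key=dist.get)


# Made-up KPI vectors: (carries per match, yards per carry, tackles per match).
reference_profile = (10.0, 5.0, 12.0)
candidates = {
    "Player A": (9.0, 5.0, 11.0),
    "Player B": (4.0, 2.0, 6.0),
    "Player C": (10.0, 4.0, 12.0),
}
ranking = most_similar(reference_profile, candidates)
print(ranking)
```

Standardizing first keeps a KPI with large raw values, such as yards, from swamping one with small values, such as tries.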

If you have any questions, please leave a comment and I will answer them.