All Things Techie With Huge, Unstructured, Intuitive Leaps

A Different Kind of Software -- Happiness


This is not technology related, but rather technology of the mind.  It is a different kind of state machine in the brain, and the state is happiness.  Pictures of the boreal forest make me happy. I love the great outdoors, and the wilderness, and I had to take a break from the bits and bytes to reboot my brain.  This sort of stuff also debugs my life.

Now back to regularly scheduled programming.

Super Bowl XLVIII Predictions From A Datamining Geek


First of all, let me say that these predictions will probably be somewhat wrong.  But what if they aren't?  They are made without looking at the football performance of the two teams -- Seattle and Denver.  Rather, they come from analyzing social media and datamining the web for crowd chatter.  The premise is that the crowd is always right, and perhaps it will be on this one.  I like to call this technique "crowd mining".  So let's get to it.

Going on crowd behavior alone, my research and dubious math skills predict that the winner will be the Seattle Seahawks.  After I crunched the numbers, I was disheartened to learn that Peyton Manning is the Broncos quarterback.  He is a formidable force, and I am glad that I didn't know this before I started the exercise; otherwise it would have skewed my results.  It may also be the reason why these predictions turn out to be wrong.

So what score does the crowd sentiment suggest?

Seattle 31

Denver 24


Then I decided to apply some statistical normalization to the numbers, and I got the following result:


Seattle 27

Denver 21


So, I am going to do something that I have never done before, and lay a bet on those two results.  Various pundits figure that over $100 million will be bet on this Super Bowl.

As a further step in statistical analysis, you have to take into account fat tails: rare, extreme events.  Suppose that there is a blowout or an injury, or a team can't play well in the weather, or the awesome Manning offense is brought down by the Seattle defense.  To cover the entire range of possibilities, the following are the complete fat tail results.  The left column is the Seahawks and the right column is the Broncos.  If these predictions are accurate and we get a weird game, either very high scoring or very low scoring, here is the range of predicted results.

Seattle       Denver

3       to     0
7       to     3
10      to     7
13      to     10
17      to     13
21      to     14
24      to     17
27      to     20
27      to     21
30      to     23
31      to     24
33      to     26
35      to     27
35      to     28
36      to     29
37      to     30
38      to     35

It will be interesting to see how these geekazoid predictions turn out.

More Chrome Grief


More Google Chrome browser grief.  I merely refreshed a Yahoo mail page and Google Chrome crashed.  Was it Yahoo's fault or Google's?  Who knows.  This is what the message said:

Unhandled exception at 0x77de15de in chrome.exe: 0xC0000374: A heap has been corrupted.

Then it asked me if I wanted to debug it.  Since I have a debugger on board, I said sure.  Here is what the trail looks like:

77DE15B3  nop              
77DE15B4  mov         eax,12Eh 
77DE15B9  xor         ecx,ecx 
77DE15BB  lea         edx,[esp+4] 
77DE15BF  call        dword ptr fs:[0C0h] 
77DE15C6  add         esp,4 
77DE15C9  ret         18h  
77DE15CC  mov         eax,12Fh 
77DE15D1  xor         ecx,ecx 
77DE15D3  lea         edx,[esp+4] 
77DE15D7  call        dword ptr fs:[0C0h] 
77DE15DE  add         esp,4 

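And it crapped out at 77DE15DE.  The instruction there adds 4 to esp, the stack pointer, but it is simply the instruction that follows the call on the previous line, so that address is most likely just where control returned to, not where the damage was done.  The status code 0xC0000374 means heap corruption: some code wrote outside a block of dynamically allocated memory, freed a block twice, or kept using a block after freeing it.  The corruption can happen long before it is detected, so this trail alone doesn't show what caused it.  Whatever the cancer in the heap was, it caused the whole thing to blow up.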

The ironic thing is that the exception was not handled.  This means that there was no accounting for error in the code.  At a high level, this is sloppy coding.  As I have said before, Chrome is starting to lose its shine.  It turns out that the Google programming machine is not as infallible as they like to believe.

Android - Programmatically Determine if Device is a Tablet




Here is some source code to determine if the Android device is a tablet or a phone:


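import android.content.Context;
import android.util.DisplayMetrics;
import android.util.Log;

// Place this method inside your Activity class (called MyActivity here).
public boolean isTablet() {
    try {
        // Replace MyActivity with the class name of your own Activity
        Context context = MyActivity.this;
        DisplayMetrics dm = context.getResources().getDisplayMetrics();
        // Convert pixel dimensions to inches using the reported pixels per inch
        float screenWidth  = dm.widthPixels / dm.xdpi;
        float screenHeight = dm.heightPixels / dm.ydpi;
        // Diagonal screen size in inches
        double size = Math.sqrt(Math.pow(screenWidth, 2) +
                                Math.pow(screenHeight, 2));
        // Treat devices with a diagonal of 6 inches or more as tablets
        return size >= 6;
    } catch (Throwable t) {
        // Log.e takes a tag first, then the message and the throwable
        Log.e("isTablet", "Failed to compute screen size", t);
        return false;
    }
}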
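
The arithmetic in that method does not depend on Android at all, so it can be pulled out into a plain Java helper and checked on a desktop JVM.  What follows is only a sketch under my own assumptions: the class name ScreenSize, the method names and the two sample devices are made up for illustration, and the 6 inch cutoff is the same rule of thumb used above.

```java
/** Plain-Java version of the screen-size arithmetic (hypothetical helper, not an Android API). */
final class ScreenSize {
    /** Same rule of thumb as above: a diagonal of 6 inches or more counts as a tablet. */
    static final double TABLET_MIN_INCHES = 6.0;

    /** Diagonal in inches, given the size in pixels and the pixels per inch on each axis. */
    static double diagonalInches(int widthPx, int heightPx, float xdpi, float ydpi) {
        if (xdpi <= 0 || ydpi <= 0) {
            throw new IllegalArgumentException("dpi values must be positive");
        }
        double widthInches = widthPx / (double) xdpi;
        double heightInches = heightPx / (double) ydpi;
        // Pythagoras: diagonal = sqrt(width^2 + height^2)
        return Math.hypot(widthInches, heightInches);
    }

    static boolean isTablet(int widthPx, int heightPx, float xdpi, float ydpi) {
        return diagonalInches(widthPx, heightPx, xdpi, ydpi) >= TABLET_MIN_INCHES;
    }

    public static void main(String[] args) {
        // Two made-up devices, purely for illustration
        System.out.println("1080x1920 at 440 dpi is a tablet: " + isTablet(1080, 1920, 440f, 440f));
        System.out.println("1600x2560 at 300 dpi is a tablet: " + isTablet(1600, 2560, 300f, 300f));
    }
}
```

In the Activity, you would pass in dm.widthPixels, dm.heightPixels, dm.xdpi and dm.ydpi.  Keep in mind that the answer is only as good as the xdpi and ydpi values the device reports.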





Converting .wmv to .mp4 for free, online


I had a .wmv (Windows Media Video) file that I wanted to convert to .mp4 so that a Java platform would play it with ease.  I have a bunch of video tools on my computer, so I decided to check them out first.

VLC Media player looked like it could do the job.  Using it was one of the most frustrating exercises that I have ever undertaken and it burned most of the day.

It looks easy enough.  You choose the Convert/Save menu item.  Then you open the file, put in the destination file, and choose what codecs you have.  That was the issue.  I chose the first one (some sort of video + MP3) and nothing happened.  No error message.  Nada.

I then clicked on that menu item and tried every single damn codec listed.  In some cases I got the audio but no video.  Most of the time, I clicked OK and nothing happened.

I finally got a movie out of one of the codecs, but it was incredibly poor quality with huge pixelation.  Another one gave me a good quality, but it consistently put in a bunch of junk (huge color blocks, pixels and bits of screen detritus) at the start.  I knew something was wrong, because in each effort, I never got a thumbnail of a still from the movie.

In frustration, I did a Google search for online .wmv to .mp4 converter.  The first one that popped up was www.online-convert.com.  I went to the site, selected my video and within a minute it was ready to download the .mp4.  The conversion was fast, and the result was spectacular.  There was a thumbnail of the very first frame.

These guys should have their blood preserved for posterity.  Thanks guys.  (As usual, I was not paid to say this, nor did I receive any consideration at all.)

Malware Spam Says That Your Mailbox Has Reached Its Limit

I am seeing a new kind of spam lately.  The sender claims to be "Email Administrator" and the subject is "Notification Alert".  The message tells me that I have exceeded my mailbox limit, and that I need to click a link to fix it.  Here is the text of the message:

Email Administrator <administrateur@rdp.com>
12:05 PM (2 hours ago)


Dear Account User,

Your mailbox has exceeded the limit of 30 GB, which is as set by your manager, you are currently at 30.9GB, very soon you will not be able to create new e-mail to send or receive again until you validate your mailbox.To re-validate your mailbox, click on the link below and follow the instruction for your upgrade.

Click Here To Upgrade


Regards,

Email  Administrator Member Services
****************************** **************************
If you received this in Spam, please kindly move it to inbox.

Notice that the sender's address uses the French spelling, "administrateur".  That would suggest that the perpetrator may be from a French speaking country in Africa, like Senegal, Algeria, Burkina Faso or any other former French colony, although a spelling alone is hardly proof.

The other thing to note is that the sender's email address is not from the same domain as my mailbox.  That should raise red flags.
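
That domain check is easy to automate.  Here is a minimal sketch in Java, assuming you know which domain your real mail administrator would write from; the class name, the method names and the expected domain example.com are all made up for illustration.  A From address can be forged, so a match proves nothing by itself, but a mismatch like the one above is a cheap warning sign.

```java
import java.util.Locale;

/** Hypothetical helper that compares a sender's domain with the domain you expect. */
final class SenderDomainCheck {
    /** Extracts the domain from "user@host" or "Name <user@host>". */
    static String domainOf(String sender) {
        String s = sender.trim();
        if (s.endsWith(">")) {
            s = s.substring(0, s.length() - 1); // drop the closing angle bracket
        }
        int at = s.lastIndexOf('@');
        if (at < 0 || at == s.length() - 1) {
            throw new IllegalArgumentException("no domain found in: " + sender);
        }
        return s.substring(at + 1).toLowerCase(Locale.ROOT);
    }

    /** True if the sender's domain is the expected domain or a subdomain of it. */
    static boolean isFromDomain(String sender, String expectedDomain) {
        String domain = domainOf(sender);
        String expected = expectedDomain.toLowerCase(Locale.ROOT);
        return domain.equals(expected) || domain.endsWith("." + expected);
    }

    public static void main(String[] args) {
        String sender = "Email Administrator <administrateur@rdp.com>";
        // example.com stands in for the domain of your own mail provider
        System.out.println(domainOf(sender) + " matches example.com: "
                + isFromDomain(sender, "example.com"));
    }
}
```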

What I did do was investigate the links that you have to click to load the malware.  (Don't try this at home, folks.  I am a professional, and I have a machine that I can trash.  I use it to trap viruses and have a look at them.)

There are two domains in the various messages.  Here are the domains:

www.ayotec.co.uk
www.kleine-bucher.de

These are legitimate domains.  What these guys do is hack websites that are not updated regularly, or whose source code is checked infrequently.  They simply add another landing page that isn't linked from anywhere on the website, but can be reached directly with a URL.  They hide their malware there, and as a result, the originator is very hard to trace.

Just another day in the war on Malware and Spam.

To Catch A Sexual Predator -- Morph Software for Police Work

I think that I have come up with a new crime-fighting technique.  It is the use of morphing software for police sketches.  Let me explain.

The police in Ottawa, Canada have a problem.  There is a sexual predator loose in the city.  He chokes women as he sexually attacks them.  He hasn't killed yet, but the escalation of his attacks might lead him to do so.

There are two police sketches of him.  Each time this serial sexual predator attacked, a police artist made a sketch.  Recently the police realized that the two sketches were of the same person.  They published them in the online paper, looking for help from the public.  There is a precedent for this.  When serial killer Paul Bernardo, the husband of killer Karla Homolka, was still known only as the Scarborough rapist, the police artists created a sketch.  It was on the bulletin board where Bernardo worked as an accountant trainee, and someone had penciled the word "Paul" on it because the sketch looked like him.  People there thought that it was a joke.

The paper published sketch 1.  It looks like this:


The second police artist sketch, from a second attack, looked like this:

Even though they don't look alike at all, there are some common features.  There are the thick eyebrows, the thick lips, and an exposed forehead.  These are the features that the scared victims of the sexual assaults remember the perpetrator by.

I got to thinking, that perhaps I could combine the two with morphing software.  I downloaded the photographs from the online newspaper and fed them into the morphing software.  This software has many features.

The first rendition was a 50/50 mix of the two sketches.  It looks like this:

Then I re-did the morph with the supposition that perhaps the first sketch was more accurate.  This was the source wrap, and the source wrap morph came out this way:

The next logical step was to suppose that the target, sketch 2, was more accurate.  This is called the Target Wrap and it looked like this:

The next two renditions were halfway renditions from sketch 1 and sketch 2 to the 50-50 composite.  The first one is halfway between sketch 1 and the 50-50 merge, and it looks like this:

The second is halfway between the target, sketch 2, and the 50-50 merge, and it looks like this:

Needless to say, the police investigator now has these renditions in his possession, and I will keep the blog informed on developments and on how this exercise in crime-fighting turned out.

The Future of Big Data


A brand new type of derivative of the future that will be traded like options and stocks.

Financial institutions like Goldman Sachs and others have a penchant for doing business solely to make money for the sake of making money.  They were the ones that pushed toxic collateralized debt obligations into the banking system -- essentially a risky type of security, built largely on sub-prime mortgages, that was at the heart of the economic meltdown of 2008.

There are other types of so-called securities or derivatives that bankers love to bet on to make money.  Blythe Masters, a banker at JPMorgan Chase, is widely credited with inventing the credit default swap (CDS).  It is essentially a derivative, a synthetic or derived investment instrument used for hedging loans.  The way that it works is that when a loan is given, a buyer can buy what amounts to a bet that the loan will default.  As long as the loan doesn't default, the buyer pays a premium to the seller of the CDS.  If the loan defaults, the seller must pay the value of the loan.  You do not have to own the loan to participate.  It is like one huge casino between financial institutions.  Huge amounts swing on small margins.
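
The cash flows just described can be sketched in a few lines of Java.  This is a toy model under my own assumptions, not how a real CDS is priced: the class and method names, the flat annual premium and the example figures are all made up, the payout is the full value of the loan as described above, and recovery and discounting are ignored.

```java
/** Toy model of credit default swap cash flows (hypothetical names and figures). */
final class CdsSketch {
    /**
     * Net cash flow to the buyer of the CDS.
     * The buyer pays a premium for every year the loan survives;
     * if the loan defaults, the seller pays the value of the loan.
     */
    static double buyerNet(double loanValue, double annualPremiumRate,
                           int yearsPaid, boolean defaulted) {
        double premiumsPaid = loanValue * annualPremiumRate * yearsPaid;
        double payout = defaulted ? loanValue : 0.0;
        return payout - premiumsPaid;
    }

    /** The seller's result is the mirror image of the buyer's. */
    static double sellerNet(double loanValue, double annualPremiumRate,
                            int yearsPaid, boolean defaulted) {
        return -buyerNet(loanValue, annualPremiumRate, yearsPaid, defaulted);
    }

    public static void main(String[] args) {
        double loan = 10_000_000.0; // made-up loan value
        double rate = 0.02;         // made-up premium: 2% of the loan per year
        System.out.printf("No default after 3 years, buyer nets %.0f%n",
                buyerNet(loan, rate, 3, false));
        System.out.printf("Default after 3 years, buyer nets %.0f%n",
                buyerNet(loan, rate, 3, true));
    }
}
```

With these made-up numbers the seller collects a modest premium each year but is on the hook for the entire loan if it defaults, which is why huge amounts swing on small margins.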


Other financial derivatives include options: puts and calls, which are bets on whether a security (such as a stock) will go down or up.  These derivatives are traded like commodities in a secondary market -- meaning that trading them does not involve the underlying companies, even though their value derives from the underlying stock.

The inventor of the CDS or credit default swap has been called the woman who invented financial weapons of mass destruction.  The reason for this is that big money swings in the balance should an "event" occur which triggers a payout.  Credit default swaps can be bought and sold as well as other derivatives.

The first to market with any type of new financial instrument is the big winner.  Financial institutions cannot resist the urge to make money at any and every opportunity, and there is a big opportunity opening up.

I am here to predict the next type of derivative that will hit the market.  It will be the BDD or Big Data Derivative.  Big Data is defined as huge amounts of data that corporations generate.  It can include machine-generated data, sales data, manufacturing data, personnel data, web visit data or any other kind of information stored.

Until the advent of fast processors, servers with almost unlimited capacity and bandwidth to kill, processing this data was almost impossible.  Now there is a virtual deluge of data, and it can be extremely valuable.  The cycle works like this:  Data is mined for information.  Information is integrated into knowledge.  Knowledge is used to generate money.

Data mining is a burgeoning field, and although there are formalized methodologies to do it, it is still a wide open field, and anyone who is knowledgeable in statistics and higher math can generate algorithms and formulae to tease valuable knowledge out of the data for profit.  Just as financial institutions develop proprietary algorithms for computer trading, companies develop proprietary data mining algorithms.

I predict that Big Data will be a commodity.  It will be treated like precious metal ore.  A company can choose to mine and refine it for its own revenue streams, or it can sell it, and I predict that Big Data will be traded like a commodity such as copper, cotton or pork bellies.  The Big Data from one company can be valuable to a whole host of other companies.  Even mundane data like machine cycles in a manufacturing environment can be processed to do value engineering or economic modeling for new ventures.  There are as many uses for processed Big Data as there are business endeavors.  The companies that adopt the knowledge from Big Data will have a tactical edge over those that don't.

So what are some of the elements of the commerce side of Big Data?  Someone will make a ton of money with classification algorithms.  Other quants will come up with algorithms to value it, and it will remain the last bastion of true arbitrage, because one man's scrap is another man's gold.  Some techno-freaks will invent classifiers built into edge databases or database engine structures designed for real time intelligence.  There is a universe of possibilities.

There are so many possible money-making opportunities with Big Data, that derivatives will become standard instruments of trade and money-making.  A white paper is currently being authored.  To reserve your copy please send an email to DataPrivacy@mail.com.