All Things Techie With Huge, Unstructured, Intuitive Leaps

Standard Artificial Neural Network Template



For the past few weeks, I have been thinking about what it would take to move a trained artificial neural network from one computer to another, port it across operating systems, or even sell the trained network on a digital exchange in the future.

It really doesn't matter what programming language an artificial neural network is written in. They all share the same parameters: inputs, outputs, weights, biases, and so on. These values are particularly well suited to being fed into a program as an XML document based on an .XSD schema, or in a lightweight format like JSON. To my knowledge, however, this hasn't been done, so I took it upon myself to crack one out.

The template is useful not only for making an untrained network portable; it also has data elements for a trained network, making the results of deep learning, machine learning, and AI training portable and available.

Even where existing binaries are used, creating a front end to read these values in would take minimal programming, re-programming, or updating.

I also took the opportunity to make it extensible and flexible. Some elements cover capabilities that do not exist yet (such as an XML Schema type for a function), but I put the hooks in so they can be used once they are developed.

A few other interesting things are included. There is the option to define more than one activation function, and values for the local gradient, the alpha, and other parameters are included so that back propagation can be continued.

There is room to include a link to the original dataset on which the network was trained (it could be a URL, a directory path, a database URL, etc.), and an element to record the number of training epochs.  With all of this information, the artificial neural net can be re-created from scratch.

There is extensibility in case this network is chained to another, and an added data dimension in case other types of neurons are invented, such as accumulators or neurons that return a probability.

I put this .xsd template on Github as a public repository. You can download it from here:

http://github.com/kenbodnar/ann_template

Or, if you wish, here are the contents of the .xsd, called ann.xsd.  It is heavily commented for convenience.


<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="artificial_neural_network">
    <xs:complexType>
      <xs:sequence>
        <!-- The "name" element is the name of the network. They should have friendly names that can be referred to if it ever goes up for sale, rent, swap, donate, or promulgate.-->
        <xs:element name="name" type="xs:string" minOccurs="1" maxOccurs="1"/>
        <!-- The "id" element is optional and can be the pkid if the values of this network are stored in an SQL (or NOSQL) database, to be called out and assembled into a network on an ad hoc basis-->
        <xs:element name="id" type="xs:integer" minOccurs="0" maxOccurs="1"/>
        <!-- The "revision" element is for configuration control-->
        <xs:element name="revision" type="xs:string" minOccurs="1" maxOccurs="1"/>
        <!-- The "revision_history" is optional and is an element to describe changes to the network -->
        <xs:element name="revision_history" type="xs:string" minOccurs="0" maxOccurs="1"/>
        <!-- The "classification" element is put in for later use. Someone will come up with a classification algorithm for types of neural nets. There is room for a multiplicity of classifications-->
        <xs:element name="classification" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
        <!-- The "region" element is optional and will be important if the networks are chained together, and the neurons have different functions than a standard neuron, like an accumulator or a probability computer
        and are grouped by region, disk, server, cloud, partition, etc-->
        <xs:element name="region" type="xs:string" minOccurs="0" maxOccurs="1"/>
        <!-- The "description" element is an optional field, however a very useful one.-->
        <xs:element name="description" type="xs:string" minOccurs="0" maxOccurs="1"/>
        <!-- The "creator" element is optional and denotes who trained these nets -->
        <xs:element name="creator" type="xs:string" minOccurs="0" maxOccurs="1"/>
        <!-- The "notes" element is optional and is self explanatory-->
        <xs:element name="notes" type="xs:string" minOccurs="0" maxOccurs="1"/>
        <!-- The "dataset_source" element defines the origin of the training data. It could be a URL -->
        <xs:element name="dataset_source" type="xs:string" minOccurs="0" maxOccurs="1"/>
        <!-- This optional element, together with the source data helps to recreate this network should it go wonky -->
        <xs:element name="number_of_training_epochs" type="xs:integer" minOccurs="0" maxOccurs="1"/>
        <!-- The "number_of_layers" defines the total-->
        <xs:element name="number_of_layers" type="xs:integer" minOccurs="1" maxOccurs="1"/>
        <xs:element name="layers">
          <xs:complexType>
            <xs:sequence>
              <!-- Repeat as necessary for the number of layers-->
              <xs:element name="layer" minOccurs="1" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:sequence>
                    <!-- Layer Naming and Neuron Naming will ultimately have a recognized convention eg. L2-N1 is Layer 2, Neuron #1-->
                    <xs:element name="layer_name" type="xs:string" minOccurs="0" maxOccurs="1"/>
                    <!-- number of neurons is for the benefit of an object-oriented constructor-->
                    <xs:element name="number_of_neurons" type="xs:integer" minOccurs="1" maxOccurs="1"/>
                    <!-- defining the neuron; this element is repeated as many times as necessary-->
                    <xs:element name="neuron" minOccurs="1" maxOccurs="unbounded">
                      <xs:complexType>
                        <xs:sequence>
                          <!--optional ~  currently it could be a perceptron, but it could also be a new type, like an accumulator, or probability calculator-->
                          <xs:element name="type" type="xs:string" minOccurs="0" maxOccurs="1"/>
                          <!-- name is optional ~ name will be standardized eg. L1-N1 layer/neuron pair. The reason is that there might be benefit in synaptic joining of this layer to other networks and one must define the joins -->
                          <xs:element name="name" type="xs:string" minOccurs="0" maxOccurs="1"/>
                          <!-- optional ~ again, someone will come up with a classification system-->
                          <xs:element name="neuron_classification" type="xs:string" minOccurs="0" maxOccurs="1"/>
                          <!-- number of inputs-->
                          <xs:element name="number_of_inputs" type="xs:integer" minOccurs="1" maxOccurs="1"/>
                          <!-- required if the input layer is also an output layer - eg. sigmoid, heaviside etc-->
                          <xs:element name="primary_activation_function_name" type="xs:string" minOccurs="0" maxOccurs="1"/>
                          <!-- ~ optional - XSD has no function type yet, so this is a string placeholder until one exists -->
                          <xs:element name="primary_activation_function" type="xs:string" minOccurs="0" maxOccurs="1"/>
                          <!-- in lieu of an embeddable function, a description could go here ~ optional -->
                          <xs:element name="primary_activation_function_description" type="xs:string" minOccurs="0" maxOccurs="1"/>
                          <!-- possible alternate activation functions eg. sigmoid, heaviside etc-->
                          <xs:element name="alternate_activation_function_name" type="xs:string" minOccurs="0" maxOccurs="1"/>
                          <!-- ~ optional - XSD has no function type yet, so this is a string placeholder until one exists -->
                          <xs:element name="alternate_activation_function" type="xs:string" minOccurs="0" maxOccurs="1"/>
                          <!-- in lieu of an embeddable function, a description could go here ~ optional -->
                          <xs:element name="alternate_activation_function_description" type="xs:string" minOccurs="0" maxOccurs="1"/>
                          <!-- optional ~ used if this is an output layer or the neuron requires an activation threshold-->
                          <xs:element name="activation_threshold" type="xs:double" minOccurs="0" maxOccurs="1"/>
                          <xs:element name="learning_rate" type="xs:double" minOccurs="1" maxOccurs="1"/>
                          <!-- the alpha or the 'movement' is used in the back propagation formula to calculate new weights-->
                          <xs:element name="alpha" type="xs:double" minOccurs="1" maxOccurs="1"/>
                          <!-- the local gradient is used in back propagation-->
                          <xs:element name="local_gradient" type="xs:double" minOccurs="1" maxOccurs="1"/>
                          <!-- inputs: repeated as many times as needed-->
                          <xs:element name="input" minOccurs="1" maxOccurs="unbounded">
                            <xs:complexType>
                              <xs:sequence>
                                <!-- Inputs optionally named in case order is necessary for definition -->
                                <xs:element name="input_name" type="xs:string" minOccurs="0" maxOccurs="1"/>
                                <!-- use appropriate type-->
                                <xs:element name="input_value_double" type="xs:double" minOccurs="0" maxOccurs="unbounded"/>
                                <!-- use appropriate type-->
                                <xs:element name="input_value_integer" type="xs:integer" minOccurs="0" maxOccurs="unbounded"/>
                                <!-- weight for this input-->
                                <xs:element name="input_value_weight" type="xs:double" minOccurs="1" maxOccurs="1"/>
                                <!-- added as a convenience for continuation of back propagation if the network is relocated, moved, cloned, etc-->
                                <xs:element name="input_value_previous_weight" type="xs:double" minOccurs="1" maxOccurs="1"/>
                              </xs:sequence>
                            </xs:complexType>
                          </xs:element>
                          <!-- end of input-->
                          <!-- bias start-->
                          <xs:element name="bias">
                            <xs:complexType>
                              <xs:sequence>
                                <xs:element name="bias_value" type="xs:double" minOccurs="1" maxOccurs="1"/>
                                <xs:element name="bias_value_weight" type="xs:double" minOccurs="1" maxOccurs="1"/>
                                <!-- added as a convenience for continuation of back propagation if the network is relocated, moved, cloned, etc-->
                                <xs:element name="bias_value_previous_weight" type="xs:double" minOccurs="1" maxOccurs="1"/>
                              </xs:sequence>
                            </xs:complexType>
                          </xs:element>
                          <!-- end of bias-->
                          <xs:element name="output">
                            <xs:complexType>
                              <xs:sequence>
                                <!-- outputs optionally named in case order is necessary for definition -->
                                <xs:element name="output_name" type="xs:string" minOccurs="0" maxOccurs="1"/>
                                <xs:element name="output_value_double" type="xs:double" minOccurs="0" maxOccurs="unbounded"/>
                                <!-- hypothetical value is a description of what it means if the neuron activates and fires as output if this is the last layer-->
                                <xs:element name="hypothetical_value" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
                              </xs:sequence>
                            </xs:complexType>
                          </xs:element>
                          <!-- end of output-->
                        </xs:sequence>
                      </xs:complexType>
                    </xs:element>
                    <!-- end of neuron-->
                  </xs:sequence>
                </xs:complexType>
              </xs:element>
              <!-- end of layer-->
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <!-- end of layers-->
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <!-- network-->
</xs:schema>
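
For illustration, here is a minimal instance document shaped to the schema above, with a single layer and a single neuron. All of the values (the network name, weights, and so on) are invented for the example:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<artificial_neural_network>
  <name>xor-demo</name>
  <revision>1.0</revision>
  <number_of_layers>1</number_of_layers>
  <layers>
    <layer>
      <layer_name>L1</layer_name>
      <number_of_neurons>1</number_of_neurons>
      <neuron>
        <name>L1-N1</name>
        <number_of_inputs>1</number_of_inputs>
        <primary_activation_function_name>sigmoid</primary_activation_function_name>
        <activation_threshold>0.5</activation_threshold>
        <learning_rate>0.1</learning_rate>
        <alpha>0.9</alpha>
        <local_gradient>0.0</local_gradient>
        <input>
          <input_name>L1-N1-I1</input_name>
          <input_value_double>1.0</input_value_double>
          <input_value_weight>0.42</input_value_weight>
          <input_value_previous_weight>0.40</input_value_previous_weight>
        </input>
        <bias>
          <bias_value>1.0</bias_value>
          <bias_value_weight>-0.30</bias_value_weight>
          <bias_value_previous_weight>-0.28</bias_value_previous_weight>
        </bias>
        <output>
          <output_value_double>0.73</output_value_double>
          <hypothetical_value>positive match</hypothetical_value>
        </output>
      </neuron>
    </layer>
  </layers>
</artificial_neural_network>
```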

I hope this helps someone. This is open source. Please use it and pass it on if you find it useful.

Perils of Overtraining in AI Deep Learning


When we partnered with a local university department of Computer Science to create some Artificial Neural Networks (ANNs) for our platform, we gave them several years of data to play with.  They massaged the input data, created an ANN machine and ran training epochs to kingdom come.

 The trouble with ANNs is that you can over-train them.  An over-trained network responds to its training data set with high accuracy, but it is not general enough to accurately process new data.  To put it in general terms, its point of view is too narrow, encompassing only the data that it was trained on.

During the training process, I intuitively guessed that accuracy would improve exponentially with each iterative training epoch.  I was wrong.  Here is a graph showing that the improvement is linear rather than exponential over the training cycle.


So the minute the graph stops being linear is when you stop training.  However, as our university friends found out, they had no way to regress the machine exactly one training epoch back.  They had no record of the weights, biases, adjusted weights, etc. of each epoch after the hours of back propagation, and as a result they had to re-run all of the training.

Me, I had a rather primitive way of saving the states of the neurons and layers, which I have mentioned before. I wrote my machine in Java using object-oriented programming, and those objects have the ability to be serialized.  In other words, binary objects in memory can be preserved in their current state, written to disk, and then resurrected to be active in the last state that they were in.  Kind of like freezing a body cryogenically, but with the ability to bring it back to life.
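As a sketch of that serialization approach: the NeuralNet class below is a hypothetical stand-in for the real machine's classes; only the save-and-resurrect mechanics are the point.

```java
import java.io.*;

// Hypothetical stand-in for the actual network classes; any object graph
// whose classes implement Serializable can be frozen and thawed this way.
class NeuralNet implements Serializable {
    private static final long serialVersionUID = 1L;
    double[] weights = {0.5, -0.3, 0.8};
    int epoch = 0;
}

public class SnapshotDemo {
    // Write the whole object graph to disk after a training epoch.
    static void save(NeuralNet net, String path) throws IOException {
        try (ObjectOutputStream out =
                new ObjectOutputStream(new FileOutputStream(path))) {
            out.writeObject(net);
        }
    }

    // Resurrect the network in exactly the state it was saved in.
    static NeuralNet load(String path) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                new ObjectInputStream(new FileInputStream(path))) {
            return (NeuralNet) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        NeuralNet net = new NeuralNet();
        net.epoch = 42;
        save(net, "epoch42.ser");           // snapshot after epoch 42
        NeuralNet restored = load("epoch42.ser");
        System.out.println(restored.epoch); // prints 42
    }
}
```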

So after every training epoch, I serialize the machine.  If I over-train the neural nets, I can get a signal by examining and/or plotting the error rates, which are the inverse of the accuracy of the nets. In the above graph, once the function stops being linear, I know that I am approaching the over-training event horizon.  Then I can regress using my saved serialized versions of the AI machine.
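Detecting that knee in the curve can be as simple as watching the epoch-to-epoch drop in error. A minimal sketch, with invented error values and an invented tolerance:

```java
public class EarlyStop {
    // Returns the first epoch at which the drop in error falls below
    // the tolerance -- i.e. where the roughly linear improvement flattens.
    // At that point, revert to the snapshot of the previous epoch.
    static int stopEpoch(double[] errorPerEpoch, double tolerance) {
        for (int e = 1; e < errorPerEpoch.length; e++) {
            if (errorPerEpoch[e - 1] - errorPerEpoch[e] < tolerance) {
                return e;
            }
        }
        return errorPerEpoch.length - 1; // never flattened within the run
    }

    public static void main(String[] args) {
        double[] errors = {0.50, 0.40, 0.30, 0.29, 0.288};
        System.out.println(stopEpoch(errors, 0.05)); // prints 3
    }
}
```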

Then the Eureka moment struck me! I had discovered a quick and easy cure for over-training.

In a previous blog article, a few down from here (or http://coderzen.blogspot.com/2015/01/brain-cells-for-sale-need-for.html ), I made the case for a standardized AI machine where you could have a lightweight XML or JSON representation of the layers, inputs, number of neurons, outputs, and even hypothetical value mappings for the outputs.  With that, you wouldn't need to serialize the whole machine.  At the end of every training epoch, you just output the recipe for the layers, weights, biases, etc., and you could revert to an earlier training incarnation by inputting a new XML file or a JSON object.
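Such a per-epoch recipe might look something like the following JSON. The field names are illustrative, loosely mirroring the schema elements described above, and all of the values are invented:

```json
{
  "name": "xor-demo",
  "epoch": 127,
  "layers": [
    {
      "layer_name": "L1",
      "neurons": [
        {
          "name": "L1-N1",
          "weights": [0.42, -0.17],
          "previous_weights": [0.40, -0.15],
          "bias": { "value": 1.0, "weight": -0.30, "previous_weight": -0.28 },
          "learning_rate": 0.1,
          "alpha": 0.9,
          "local_gradient": 0.0
        }
      ]
    }
  ]
}
```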

It's really time to draw up the .XSD schema for the standardized neuron. I want it to be open source. It wouldn't be horrible to be famous for thinking up a standardized neural net. Besides, being famous is just a job.

Smart Road JSON or XML Template For Messaging and Data Transmission and Receiving



With the Internet of Everything here among us, GPS ubiquitous in vehicles, and cars gaining full internet capability, it will not be long before we have Smart Roads.  Each road will have its own IP address.

I began wondering what the data package would look like for Smart Roads, and then I was struck with the axiom that the best way to predict the future is to invent it.

So without further ado, I cracked together an XML file of what the Smart Road dataset would look like.  Here is my first crack at creating a Smart Road Data Standard:

<?xml version="1.0" encoding="UTF-8"?>
<!-- Smart Road Markup Language version 1.0-->

<smart_road>  
<!-- Main Element.  It consists of a header, data and trailer elements. -->


    <header>
      <!-- Header Element.  -->

      <ref_number></ref_number> <!-- This could be a database Primary Key Identifier.  -->
      <country></country>
      <state></state>
      <province></province>
      <county></county>
      <township></township>
      <ip6_address></ip6_address> <!-- Each Smart Road will have their own IP address.  -->
      <start_gps></start_gps>
      <classification></classification> <!-- There could be subclassifications like carriageways etc.  -->
      <end_gps></end_gps>
      <length_kms></length_kms>
      <maximum_speed_limit></maximum_speed_limit>
      <options></options> <!-- This could be a control element for suppressing some header info on subsequent exchanges.  -->

     </header>

  <data> <!-- Message Section. -->
    <data_sent>
      <general_data> <!-- This section contains general data sent to the vehicles -->
        <alerts>
          <current_alerts></current_alerts>
          <upcoming_alerts></upcoming_alerts>
        </alerts>
        <flags></flags>
        <messages>
          <alerts></alerts>
          <construction></construction>
          <law_enforcement></law_enforcement>
          <traffic></traffic>
          <weather></weather>
          <misc></misc>
          <user_specific>
            <destination_address></destination_address>
            <ack_flag></ack_flag>
            <message_payload></message_payload>
            <delivery_receipt></delivery_receipt>
          </user_specific>
        </messages>
      </general_data>
      <location_specific_data> <!-- This section contains location-specific data sent to the vehicles -->
        <current_gps_marker>
          <current_maximum_allowable_speed></current_maximum_allowable_speed>
          <alerts>
            <current_alerts></current_alerts>
            <upcoming_alerts></upcoming_alerts>
          </alerts>
          <flags></flags>
          <messages>
            <alerts></alerts>
            <construction></construction>
            <law_enforcement></law_enforcement>
            <traffic></traffic>
            <weather></weather>
            <misc></misc>
            <user_specific>
              <!-- It is anticipated that if a vehicle's onboard messaging is not working, one can billboard messages -->
              <!-- This can also be used to send violation notices and service-related messages to the vehicle -->
              <destination_address></destination_address>
              <ack_flag></ack_flag>
              <message_payload></message_payload>
              <delivery_receipt></delivery_receipt>
            </user_specific>
          </messages>
        </current_gps_marker>
      </location_specific_data>
    </data_sent>
    <data_received>
      <vehicle>
        <type></type>
        <description></description>
        <direction></direction>
        <velocity></velocity>
        <timestamp></timestamp>
      </vehicle>
    </data_received>
  </data>

  <meta-data>
  <!-- Big Data meta data on usage stats etc.-->
  </meta-data>
  <trailer>
    <protocols_supported></protocols_supported> <!-- Various devices will have their own native Smart Road protocols -->
    <device_types></device_types>
    <end></end>
  </trailer>
</smart_road>

This is just a first iteration.  It needs to be tested and validated in real time.  It is anticipated that the vehicle will send its GPS coordinates to the database defined by the IP address, and the location specific data will be returned.
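As a sketch of the receiving half of that exchange, here is the data_received element filled in with invented values for a single vehicle report:

```xml
<data_received>
  <vehicle>
    <type>passenger</type>
    <description>sedan</description>
    <direction>NE</direction>
    <velocity>97</velocity>
    <timestamp>2015-06-01T14:22:05Z</timestamp>
  </vehicle>
</data_received>
```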

A Standard For Twitter Hashtags

I follow Bath Rugby players on Twitter, among other things. I notice that some of the team are avid users of Twitter. They are also quite inventive with hashtags. Hashtags are much more than search tools. They can be cleverly used to create innuendo, a wry comment, a joke, or a commentary all under the guise of just being a hashtag.

However, I do propose a standard for hashtags. It is quite simple, and it is one that we use in computer programming for variable names. The standard is this: every time you come to a new word, use a capital letter. It vastly enhances readability. It can also change the meaning:

#psychotherapist

or

#PsychoTheRapist
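
The convention is also trivial to automate. A minimal sketch (the function name and the assumption that the raw phrase arrives space-separated are mine):

```java
public class HashtagCase {
    // Capitalize the first letter of each word, then join into a hashtag.
    static String toHashtag(String phrase) {
        StringBuilder sb = new StringBuilder("#");
        for (String word : phrase.trim().split("\\s+")) {
            sb.append(Character.toUpperCase(word.charAt(0)))
              .append(word.substring(1).toLowerCase());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHashtag("psycho the rapist")); // #PsychoTheRapist
    }
}
```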

So if everyone would adopt this readability standard for Twitter hashtags, the world would become a slightly less confusing place, and we would be doing our part to fight chaos and entropy.