All Things Techie With Huge, Unstructured, Intuitive Leaps

Standard Artificial Neural Network Template



For the past few weeks, I have been thinking about what it would take to move a trained artificial neural network from one computer to another, port it across operating systems, or even sell the trained network on a digital exchange in the future.

It really doesn't matter what programming language an artificial neural network is written in. They all share the same parameters: inputs, outputs, weights, biases, and so on. These values are particularly well suited to being fed into a program as an XML document based on an .XSD schema, or in a lightweight format like JSON. To my knowledge, however, this hasn't been done, so I took it upon myself to crack one out.

The template is useful not only for making an untrained network portable; it also has data elements for a trained network, making the results of deep learning, machine learning, and AI training portable and available.

Even where existing binaries are used, creating a front end to read these values in would take minimal programming, re-programming, or updating.

I also took the opportunity to make it extensible and flexible. Some elements cover capabilities that do not exist yet (such as an XML Schema type for a function), but I put the hooks in so they can be used once they are developed.

A few other interesting things are included. There is the option to define more than one activation function, and values for the local gradient, the alpha, and other parameters are included so that back propagation can be continued.

There is room to include a link to the original dataset on which the network was trained (it could be a URL, a directory path, a database URL, etc.), and an element to record the number of training epochs.  With all of this information, the artificial neural net can be re-created from scratch.

There is extensibility in case this network is chained to another, and an added data dimension in case other types of neurons are invented, such as accumulators or neurons that return a probability.

I put this .xsd template on Github as a public repository. You can download it from here:

http://github.com/kenbodnar/ann_template

Or, if you wish, here are the contents of the .xsd, called ann.xsd.  It is heavily commented for convenience.


<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="artificial_neural_network">
    <xs:complexType>
      <xs:sequence>
        <!-- The "name" element is the name of the network. They should have friendly names that can be referred to if it ever goes up for sale, rent, swap, donate, or promulgate.-->
        <xs:element name="name" type="xs:string" minOccurs="1" maxOccurs="1"/>
        <!-- The "id" element is optional and can be the pkid if the values of this network are stored in an SQL (or NOSQL) database, to be called out and assembled into a network on an ad hoc basis-->
        <xs:element name="id" type="xs:integer" minOccurs="0" maxOccurs="1"/>
        <!-- The "revision" element is for configuration control-->
        <xs:element name="revision" type="xs:string" minOccurs="1" maxOccurs="1"/>
        <!-- The "revision_history" is optional and is an element to describe changes to the network -->
        <xs:element name="revision_history" type="xs:string" minOccurs="0" maxOccurs="1"/>
        <!-- The "classification" element is put in for later use. Someone will come up with a classification algorithm for types of neural nets. There is room for a multiplicity of classifications-->
        <xs:element name="classification" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
        <!-- The "region" element is optional and will be important if the networks are chained together, and the neurons have different functions than a standard neuron, like an accumulator or a probability computer
        and are grouped by region, disk, server, cloud, partition, etc-->
        <xs:element name="region" type="xs:string" minOccurs="0" maxOccurs="1"/>
        <!-- The "description" element is an optional field, however a very useful one.-->
        <xs:element name="description" type="xs:string" minOccurs="0" maxOccurs="1"/>
        <!-- The "creator" element is optional and denotes who trained these nets -->
        <xs:element name="creator" type="xs:string" minOccurs="0" maxOccurs="1"/>
        <!-- The "notes" element is optional and is self explanatory-->
        <xs:element name="notes" type="xs:string" minOccurs="0" maxOccurs="1"/>
        <!-- The "dataset_source" element defines the origin of the training data. It could be a URL -->
        <xs:element name="dataset_source" type="xs:string" minOccurs="0" maxOccurs="1"/>
        <!-- This optional element, together with the source data helps to recreate this network should it go wonky -->
        <xs:element name="number_of_training_epochs" type="xs:integer" minOccurs="0" maxOccurs="1"/>
        <!-- The "number_of_layers" defines the total-->
        <xs:element name="number_of_layers" type="xs:integer" minOccurs="1" maxOccurs="1"/>
        <xs:element name="layers">
          <xs:complexType>
            <xs:sequence>
              <!-- Repeat as necessary for the number of layers-->
              <xs:element name="layer" minOccurs="1" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:sequence>
                    <!-- Layer Naming and Neuron Naming will ultimately have a recognized convention eg. L2-N1 is Layer 2, Neuron #1-->
                    <xs:element name="layer_name" type="xs:string" minOccurs="0" maxOccurs="1"/>
                    <!-- number of neurons is for the benefit of an object-oriented constructor-->
                    <xs:element name="number_of_neurons" type="xs:integer" minOccurs="1" maxOccurs="1"/>
                    <!-- defining the neuron; this element is repeated as many times as necessary-->
                    <xs:element name="neuron" minOccurs="1" maxOccurs="unbounded">
                      <xs:complexType>
                        <xs:sequence>
                          <!--optional ~  currently it could be a perceptron, but it could also be a new type, like an accumulator, or probability calculator-->
                          <xs:element name="type" type="xs:string" minOccurs="0" maxOccurs="1"/>
                          <!-- name is optional ~ name will be standardized eg. L1-N1 layer/neuron pair. The reason is that there might be benefit in synaptic joining of this layer to other networks and one must define the joins -->
                          <xs:element name="name" type="xs:string" minOccurs="0" maxOccurs="1"/>
                          <!-- optional ~ again, someone will come up with a classification system-->
                          <xs:element name="neuron_classification" type="xs:string" minOccurs="0" maxOccurs="1"/>
                          <!-- number of inputs-->
                          <xs:element name="number_of_inputs" type="xs:integer" minOccurs="1" maxOccurs="1"/>
                          <!-- required if the input layer is also an output layer - eg. sigmoid, heaviside etc-->
                          <xs:element name="primary_activation_function_name" type="xs:string" minOccurs="0" maxOccurs="1"/>
                          <!-- ~ optional - XSD has no function type yet, so this is a string placeholder until one exists -->
                          <xs:element name="primary_activation_function" type="xs:string" minOccurs="0" maxOccurs="1"/>
                          <!-- in lieu of an embeddable function, a description could go here ~ optional -->
                          <xs:element name="primary_activation_function_description" type="xs:string" minOccurs="0" maxOccurs="1"/>
                          <!-- possible alternate activation functions eg. sigmoid, heaviside etc-->
                          <xs:element name="alternate_activation_function_name" type="xs:string" minOccurs="0" maxOccurs="1"/>
                          <!-- ~ optional - XSD has no function type yet, so this is a string placeholder until one exists -->
                          <xs:element name="alternate_activation_function" type="xs:string" minOccurs="0" maxOccurs="1"/>
                          <!-- in lieu of an embeddable function, a description could go here ~ optional -->
                          <xs:element name="alternate_activation_function_description" type="xs:string" minOccurs="0" maxOccurs="1"/>
                          <!-- optional ~ used if this is an output layer or the neuron requires an activation threshold-->
                          <xs:element name="activation_threshold" type="xs:double" minOccurs="0" maxOccurs="1"/>
                          <xs:element name="learning_rate" type="xs:double" minOccurs="1" maxOccurs="1"/>
                          <!-- the alpha or the 'movement' is used in the back propagation formula to calculate new weights-->
                          <xs:element name="alpha" type="xs:double" minOccurs="1" maxOccurs="1"/>
                          <!-- the local gradient is used in back propagation-->
                          <xs:element name="local_gradient" type="xs:double" minOccurs="1" maxOccurs="1"/>
                          <!-- inputs: repeated as many times as needed-->
                          <xs:element name="input" minOccurs="1" maxOccurs="unbounded">
                            <xs:complexType>
                              <xs:sequence>
                                <!-- Inputs optionally named in case order is necessary for definition -->
                                <xs:element name="input_name" type="xs:string" minOccurs="0" maxOccurs="1"/>
                                <!-- use appropriate type-->
                                <xs:element name="input_value_double" type="xs:double" minOccurs="0" maxOccurs="unbounded"/>
                                <!-- use appropriate type-->
                                <xs:element name="input_value_integer" type="xs:integer" minOccurs="0" maxOccurs="unbounded"/>
                                <!-- weight for this input-->
                                <xs:element name="input_value_weight" type="xs:double" minOccurs="1" maxOccurs="1"/>
                                <!-- added as a convenience for continuation of back propagation if the network is relocated, moved, cloned, etc-->
                                <xs:element name="input_value_previous_weight" type="xs:double" minOccurs="1" maxOccurs="1"/>
                              </xs:sequence>
                            </xs:complexType>
                          </xs:element>
                          <!-- end of input-->
                          <!-- bias start-->
                          <xs:element name="bias">
                            <xs:complexType>
                              <xs:sequence>
                                <xs:element name="bias_value" type="xs:double" minOccurs="1" maxOccurs="1"/>
                                <xs:element name="bias_value_weight" type="xs:double" minOccurs="1" maxOccurs="1"/>
                                <!-- added as a convenience for continuation of back propagation if the network is relocated, moved, cloned, etc-->
                                <xs:element name="bias_value_previous_weight" type="xs:double" minOccurs="1" maxOccurs="1"/>
                              </xs:sequence>
                            </xs:complexType>
                          </xs:element>
                          <!-- end of bias-->
                          <xs:element name="output">
                            <xs:complexType>
                              <xs:sequence>
                                <!-- outputs optionally named in case order is necessary for definition -->
                                <xs:element name="output_name" type="xs:string" minOccurs="0" maxOccurs="1"/>
                                <xs:element name="output_value_double" type="xs:double" minOccurs="0" maxOccurs="unbounded"/>
                                <!-- hypothetical value is a description of what it means if the neuron activates and fires as output if this is the last layer-->
                                <xs:element name="hypothetical_value" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
                              </xs:sequence>
                            </xs:complexType>
                          </xs:element>
                          <!-- end of output-->
                        </xs:sequence>
                      </xs:complexType>
                    </xs:element>
                    <!-- end of neuron-->
                  </xs:sequence>
                </xs:complexType>
              </xs:element>
              <!-- end of layer-->
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <!-- end of layers-->
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <!-- network-->
</xs:schema>
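
For illustration, here is a minimal instance document shaped to the schema above, with a single layer and a single neuron. All of the values (the network name, weights, and so on) are invented for the example:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<artificial_neural_network>
  <name>xor-demo</name>
  <revision>1.0</revision>
  <number_of_layers>1</number_of_layers>
  <layers>
    <layer>
      <layer_name>L1</layer_name>
      <number_of_neurons>1</number_of_neurons>
      <neuron>
        <name>L1-N1</name>
        <number_of_inputs>1</number_of_inputs>
        <primary_activation_function_name>sigmoid</primary_activation_function_name>
        <activation_threshold>0.5</activation_threshold>
        <learning_rate>0.1</learning_rate>
        <alpha>0.9</alpha>
        <local_gradient>0.0</local_gradient>
        <input>
          <input_name>L1-N1-I1</input_name>
          <input_value_double>1.0</input_value_double>
          <input_value_weight>0.42</input_value_weight>
          <input_value_previous_weight>0.40</input_value_previous_weight>
        </input>
        <bias>
          <bias_value>1.0</bias_value>
          <bias_value_weight>-0.30</bias_value_weight>
          <bias_value_previous_weight>-0.28</bias_value_previous_weight>
        </bias>
        <output>
          <output_value_double>0.73</output_value_double>
          <hypothetical_value>positive match</hypothetical_value>
        </output>
      </neuron>
    </layer>
  </layers>
</artificial_neural_network>
```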

I hope this helps someone. This is open source. Please use it and pass it on if you find it useful.

Perils of Overtraining in AI Deep Learning


When we partnered with a local university department of Computer Science to create some Artificial Neural Networks (ANNs) for our platform, we gave them several years of data to play with.  They massaged the input data, created an ANN machine and ran training epochs to kingdom come.

 The trouble with ANNs is that you can over-train them.  An over-trained network responds to its training data set with high accuracy, but it is not general enough to accurately process new data.  To put it in general terms, its point of view is too narrow, encompassing only the data that it was trained on.

During the training process, I intuitively guessed that accuracy would improve exponentially with each iterative training epoch.  I was wrong.  Here is a graph showing that the improvement is linear rather than exponential over the training cycle.


So the minute the graph stops being linear is when you stop training.  However, as our university friends found out, they had no way to regress the machine exactly one training epoch back.  They had no record of the weights, biases, adjusted weights, etc. of each epoch after the hours of back propagation, and as a result they had to re-run all of the training.

Me, I had a rather primitive way of saving the states of the neurons and layers, which I have mentioned before. I wrote my machine in Java using object-oriented programming, and those objects have the ability to be serialized.  In other words, binary objects in memory can be preserved in their current state, written to disk, and then resurrected to be active in the last state that they were in.  Kind of like freezing a body cryogenically, but with the ability to bring it back to life.
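As a sketch of that serialization approach: the NeuralNet class below is a hypothetical stand-in for the real machine's classes; only the save-and-resurrect mechanics are the point.

```java
import java.io.*;

// Hypothetical stand-in for the actual network classes; any object graph
// whose classes implement Serializable can be frozen and thawed this way.
class NeuralNet implements Serializable {
    private static final long serialVersionUID = 1L;
    double[] weights = {0.5, -0.3, 0.8};
    int epoch = 0;
}

public class SnapshotDemo {
    // Write the whole object graph to disk after a training epoch.
    static void save(NeuralNet net, String path) throws IOException {
        try (ObjectOutputStream out =
                new ObjectOutputStream(new FileOutputStream(path))) {
            out.writeObject(net);
        }
    }

    // Resurrect the network in exactly the state it was saved in.
    static NeuralNet load(String path) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                new ObjectInputStream(new FileInputStream(path))) {
            return (NeuralNet) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        NeuralNet net = new NeuralNet();
        net.epoch = 42;
        save(net, "epoch42.ser");           // snapshot after epoch 42
        NeuralNet restored = load("epoch42.ser");
        System.out.println(restored.epoch); // prints 42
    }
}
```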

So after every training epoch, I serialize the machine.  If I over-train the neural nets, I can get a signal by examining and/or plotting the error rates, which are the inverse of the accuracy of the nets. In the above graph, once the function stops being linear, I know that I am approaching the over-training event horizon.  Then I can regress using my saved serialized versions of the AI machine.
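Detecting that knee in the curve can be as simple as watching the epoch-to-epoch drop in error. A minimal sketch, with invented error values and an invented tolerance:

```java
public class EarlyStop {
    // Returns the first epoch at which the drop in error falls below
    // the tolerance -- i.e. where the roughly linear improvement flattens.
    // At that point, revert to the snapshot of the previous epoch.
    static int stopEpoch(double[] errorPerEpoch, double tolerance) {
        for (int e = 1; e < errorPerEpoch.length; e++) {
            if (errorPerEpoch[e - 1] - errorPerEpoch[e] < tolerance) {
                return e;
            }
        }
        return errorPerEpoch.length - 1; // never flattened within the run
    }

    public static void main(String[] args) {
        double[] errors = {0.50, 0.40, 0.30, 0.29, 0.288};
        System.out.println(stopEpoch(errors, 0.05)); // prints 3
    }
}
```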

Then the Eureka moment struck me! I had discovered a quick and easy cure for over-training.

In a previous blog article, a few down from here (or http://coderzen.blogspot.com/2015/01/brain-cells-for-sale-need-for.html ), I made the case for a standardized AI machine where you could have a lightweight XML or JSON representation of the layers, inputs, number of neurons, outputs, and even hypothetical value mappings for the outputs.  With that, you wouldn't need to serialize the whole machine.  At the end of every training epoch, you just output the recipe for the layers, weights, biases, etc., and you could revert to an earlier training incarnation by inputting a new XML file or a JSON object.
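Such a per-epoch recipe might look something like the following JSON. The field names are illustrative, loosely mirroring the schema elements described above, and all of the values are invented:

```json
{
  "name": "xor-demo",
  "epoch": 127,
  "layers": [
    {
      "layer_name": "L1",
      "neurons": [
        {
          "name": "L1-N1",
          "weights": [0.42, -0.17],
          "previous_weights": [0.40, -0.15],
          "bias": { "value": 1.0, "weight": -0.30, "previous_weight": -0.28 },
          "learning_rate": 0.1,
          "alpha": 0.9,
          "local_gradient": 0.0
        }
      ]
    }
  ]
}
```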

It's really time to draw up the .XSD schema for the standardized neuron. I want it to be open source. It wouldn't be horrible to be famous for thinking up a standardized neural net. Besides, being famous is just a job.

Smart Road JSON or XML Template For Messaging and Data Transmission and Receiving



With the Internet of Everything here among us, GPS ubiquitous in vehicles, and cars gaining full internet capability, it will not be long before we have Smart Roads.  Each road will have its own IP address.

I began wondering what the data package would look like for Smart Roads, and then I was struck with the axiom that the best way to predict the future is to invent it.

So without further ado, I cracked together an XML file of what the Smart Road dataset would look like.  Here is my first crack at creating a Smart Road Data Standard:

<?xml version="1.0" encoding="UTF-8"?>
<!-- Smart Road Markup Language version 1.0-->

<smart_road>  
<!-- Main Element.  It consists of a header, data and trailer elements. -->


    <header>
      <!-- Header Element.  -->

      <ref_number></ref_number> <!-- This could be a database Primary Key Identifier.  -->
      <country></country>
      <state></state>
      <province></province>
      <county></county>
      <township></township>
      <ip6_address></ip6_address> <!-- Each Smart Road will have their own IP address.  -->
      <start_gps></start_gps>
      <classification></classification> <!-- There could be subclassifications like carriageways etc.  -->
      <end_gps></end_gps>
      <length_kms></length_kms>
      <maximum_speed_limit></maximum_speed_limit>
      <options></options> <!-- This could be a control element for suppressing some header info on subsequent exchanges.  -->

     </header>

  <data> <!-- Message Section. -->
    <data_sent>
      <general_data> <!-- This section contains general data sent to the vehicles -->
        <alerts>
          <current_alerts></current_alerts>
          <upcoming_alerts></upcoming_alerts>
        </alerts>
        <flags></flags>
        <messages>
          <alerts></alerts>
          <construction></construction>
          <law_enforcement></law_enforcement>
          <traffic></traffic>
          <weather></weather>
          <misc></misc>
          <user_specific>
            <destination_address></destination_address>
            <ack_flag></ack_flag>
            <message_payload></message_payload>
            <delivery_receipt></delivery_receipt>
          </user_specific>
        </messages>
      </general_data>
      <location_specific_data> <!-- This section contains location-specific data sent to the vehicles -->
        <current_gps_marker>
          <current_maximum_allowable_speed></current_maximum_allowable_speed>
          <alerts>
            <current_alerts></current_alerts>
            <upcoming_alerts></upcoming_alerts>
          </alerts>
          <flags></flags>
          <messages>
            <alerts></alerts>
            <construction></construction>
            <law_enforcement></law_enforcement>
            <traffic></traffic>
            <weather></weather>
            <misc></misc>
            <user_specific>
              <!-- It is anticipated that if a vehicle's onboard messaging is not working, one can billboard messages -->
              <!-- This can also be used to send violation notices and service-related messages to the vehicle -->
              <destination_address></destination_address>
              <ack_flag></ack_flag>
              <message_payload></message_payload>
              <delivery_receipt></delivery_receipt>
            </user_specific>
          </messages>
        </current_gps_marker>
      </location_specific_data>
    </data_sent>
    <data_received>
      <vehicle>
        <type></type>
        <description></description>
        <direction></direction>
        <velocity></velocity>
        <timestamp></timestamp>
      </vehicle>
    </data_received>
  </data>

  <meta-data>
  <!-- Big Data meta data on usage stats etc.-->
  </meta-data>
  <trailer>
    <protocols_supported></protocols_supported> <!-- Various devices will have their own native Smart Road protocols -->
    <device_types></device_types>
    <end></end>
  </trailer>
</smart_road>

This is just a first iteration.  It needs to be tested and validated in real time.  It is anticipated that the vehicle will send its GPS coordinates to the database defined by the IP address, and the location specific data will be returned.
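As a sketch of the receiving half of that exchange, here is the data_received element filled in with invented values for a single vehicle report:

```xml
<data_received>
  <vehicle>
    <type>passenger</type>
    <description>sedan</description>
    <direction>NE</direction>
    <velocity>97</velocity>
    <timestamp>2015-06-01T14:22:05Z</timestamp>
  </vehicle>
</data_received>
```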

A Standard For Twitter Hashtags

I follow Bath Rugby players on Twitter, among other things. I notice that some of the team are avid users of Twitter. They are also quite inventive with hashtags. Hashtags are much more than search tools. They can be cleverly used to create innuendo, a wry comment, a joke, or a commentary all under the guise of just being a hashtag.

However, I do propose a standard for hashtags. It is quite simple, and it is one that we use in computer programming for variable names. The standard is this: every time you come to a new word, use a capital letter. It vastly enhances readability. It can also change the meaning:

#psychotherapist

or

#PsychoTheRapist
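
The convention is also trivial to automate. A minimal sketch (the function name and the assumption that the raw phrase arrives space-separated are mine):

```java
public class HashtagCase {
    // Capitalize the first letter of each word, then join into a hashtag.
    static String toHashtag(String phrase) {
        StringBuilder sb = new StringBuilder("#");
        for (String word : phrase.trim().split("\\s+")) {
            sb.append(Character.toUpperCase(word.charAt(0)))
              .append(word.substring(1).toLowerCase());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHashtag("psycho the rapist")); // #PsychoTheRapist
    }
}
```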

So if everyone would adopt this readability standard for Twitter hashtags, the world would become a slightly less confusing place, and we would be doing our part to fight chaos and entropy.