Understanding Automatic Differentiation in 30 lines of Python
Victor MARTIN, https://vmartin.fr/, 2023-02-17<!--
Improve the sympy jacobian article to discuss the fact that scipy estimates the derivative, whereas it can sometimes be computed exactly with sympy without any trouble.
Beyond neural networks, many optimization algorithms require computing the gradient, i.e. the vector containing all the partial derivatives. The `scipy.optimize.minimize` function from the Scipy library estimates the gradient at a point $x$ using the following formula:
$$\lim_{h\to 0} \frac{f(x+h) - f(x)}{h} $$
This gives a numerical value for the derivative of $f$ at the point $x$, but it is only an approximation, and it requires calling the function $f$ twice, which can be costly. -->
<p>---> <a href="/understanding-automatic-differentiation-in-30-lines-of-python-fr.html">For the French version of this article, click here</a></p>
<p>I'm a Machine Learning engineer, and at work I use libraries like Tensorflow and Pytorch to train neural networks. For a while now, I have wanted to write the simplest possible piece of code to perform what is called <a href="https://en.wikipedia.org/wiki/Automatic_differentiation">automatic differentiation</a>, which is at the heart of neural network training.<br>
In this article, I will iteratively build the simplest code to compute derivatives automatically on scalars.</p>
<p>In the following Python code, the sum of <code>x</code> and <code>y</code> is computed and the result (<code>8</code>) is assigned to the variable <code>z</code>. After the assignment, <code>z</code> keeps no track of the variables it was built from, so there is no way to automatically update the value of <code>z</code> if <code>x</code> or <code>y</code> changes, let alone to follow the links between variables in order to compute a derivative automatically.</p>
<div class="highlight"><pre><span></span><code><span class="n">x</span> <span class="o">=</span> <span class="mi">3</span>
<span class="n">y</span> <span class="o">=</span> <span class="mi">5</span>
<span class="n">z</span> <span class="o">=</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span>
</code></pre></div>
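<p>A quick demonstration of this limitation (the variable names below mirror the snippet above):</p>

```python
x = 3
y = 5
z = x + y   # z is just the number 8, a snapshot of the result
x = 100     # rebinding x afterwards does not affect z
print(z)    # still 8: nothing records that z came from x + y
```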
<h2>The <code>Tensor</code> class</h2>
<p>The idea is to create a new type, a <code>Tensor</code>, which will allow us to do <strong>symbolic calculation</strong> on our variables.<br>
Let's start by creating a <code>Tensor</code> class where the addition operation is redefined.</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
<span class="k">class</span> <span class="nc">Tensor</span><span class="p">:</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">value</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="n">value</span>
<span class="k">def</span> <span class="fm">__repr__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="sa">f</span><span class="s2">"T:</span><span class="si">{</span><span class="bp">self</span><span class="o">.</span><span class="n">value</span><span class="si">}</span><span class="s2">"</span>
<span class="k">def</span> <span class="fm">__add__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span>
<span class="n">t</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="n">value</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">value</span> <span class="o">+</span> <span class="n">other</span><span class="o">.</span><span class="n">value</span><span class="p">)</span>
<span class="k">return</span> <span class="n">t</span>
<span class="n">x</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span>
<span class="n">z</span> <span class="o">=</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span>
<span class="nb">print</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">z</span><span class="p">)</span>
<span class="c1"># Out:</span>
<span class="c1"># T:3 T:5</span>
<span class="c1"># T:8</span>
</code></pre></div>
<p>In this example, we create a <code>Tensor</code> class that can store a value, and we redefine addition so that adding two <code>Tensor</code>s produces a new <code>Tensor</code>. There is still no symbolic computation mechanism that lets <code>z</code> record that it is the result of adding <code>x</code> and <code>y</code>.<br>
We will add this behavior by using <strong>a binary tree</strong>: each tensor will be able to reference the two tensors and the operation that produced it. To do this, we introduce the <code>Children</code> tuple, which holds these three pieces of information.</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
<span class="kn">from</span> <span class="nn">collections</span> <span class="kn">import</span> <span class="n">namedtuple</span>
<span class="n">Children</span> <span class="o">=</span> <span class="n">namedtuple</span><span class="p">(</span><span class="s1">'Children'</span><span class="p">,</span> <span class="p">[</span><span class="s1">'a'</span><span class="p">,</span> <span class="s1">'b'</span><span class="p">,</span> <span class="s1">'op'</span><span class="p">])</span>
<span class="k">class</span> <span class="nc">Tensor</span><span class="p">:</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">value</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">children</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="n">value</span>
<span class="bp">self</span><span class="o">.</span><span class="n">children</span> <span class="o">=</span> <span class="n">children</span>
<span class="k">def</span> <span class="nf">forward</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="k">return</span> <span class="bp">self</span>
<span class="c1"># compute forward pass of children in the tree</span>
<span class="n">a</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
<span class="n">b</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
<span class="c1"># If values are set, let's compute the real value of this tensor</span>
<span class="k">if</span> <span class="n">a</span><span class="o">.</span><span class="n">value</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span> <span class="ow">and</span> <span class="n">b</span><span class="o">.</span><span class="n">value</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span><span class="p">(</span><span class="n">a</span><span class="o">.</span><span class="n">value</span><span class="p">,</span> <span class="n">b</span><span class="o">.</span><span class="n">value</span><span class="p">)</span>
<span class="k">return</span> <span class="bp">self</span>
<span class="k">def</span> <span class="fm">__repr__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="sa">f</span><span class="s2">"T:</span><span class="si">{</span><span class="bp">self</span><span class="o">.</span><span class="n">value</span><span class="si">}</span><span class="s2">"</span>
<span class="k">def</span> <span class="fm">__add__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span>
<span class="n">c</span> <span class="o">=</span> <span class="n">Children</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">add</span><span class="p">)</span>
<span class="n">t</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="n">children</span><span class="o">=</span><span class="n">c</span><span class="p">)</span>
<span class="k">return</span> <span class="n">t</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
<span class="k">def</span> <span class="fm">__mul__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span>
<span class="n">c</span> <span class="o">=</span> <span class="n">Children</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">multiply</span><span class="p">)</span>
<span class="n">t</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="n">children</span><span class="o">=</span><span class="n">c</span><span class="p">)</span>
<span class="k">return</span> <span class="n">t</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
<span class="n">x</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span>
<span class="n">z1</span> <span class="o">=</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span>
<span class="n">z2</span> <span class="o">=</span> <span class="n">z1</span> <span class="o">*</span> <span class="n">y</span>
<span class="nb">print</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">z2</span><span class="p">)</span>
<span class="c1"># Out</span>
<span class="c1"># T:3 T:5</span>
<span class="c1"># T:40</span>
</code></pre></div>
<p>Now a tensor, in addition to holding a numerical value, contains the <code>children</code> tuple that keeps track of the computation. In this example, besides introducing the <code>Children</code> type, we have added a multiplication method to tensors, as well as a <code>forward</code> method that <strong>executes the computation graph</strong> and computes the actual value of each tensor. The tensor <code>z2</code> can be modeled by the following computation graph.</p>
<p><img src="imgs/auto-diff/graph_xyz1z2.png" alt="Computation graph of the tensor z2" width="300" /></p>
<p>We can check that it works as expected by first creating the graph without specifying any values:</p>
<div class="highlight"><pre><span></span><code><span class="n">x</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="kc">None</span><span class="p">)</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="kc">None</span><span class="p">)</span>
<span class="n">z1</span> <span class="o">=</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span>
<span class="n">z2</span> <span class="o">=</span> <span class="n">z1</span> <span class="o">*</span> <span class="n">y</span>
<span class="nb">print</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">z2</span><span class="p">)</span>
<span class="c1"># Out</span>
<span class="c1"># T:None T:None</span>
<span class="c1"># T:None</span>
</code></pre></div>
<p>Then the values of the leaves of the tree (<code>x</code> and <code>y</code>) can be set and the value of <code>z2</code> computed. The call to <code>z2.forward()</code> triggers the <code>forward</code> method of <code>z1</code> and <code>y</code>, and these calls traverse the graph to recursively compute the value of <code>z2</code>.</p>
<div class="highlight"><pre><span></span><code><span class="n">x</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="mi">3</span>
<span class="n">y</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="mi">5</span>
<span class="nb">print</span><span class="p">(</span><span class="n">z2</span><span class="o">.</span><span class="n">forward</span><span class="p">())</span>
<span class="c1"># Out</span>
<span class="c1"># T:40</span>
</code></pre></div>
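<p>To make this graph reuse concrete, here is a condensed, self-contained sketch of the class as defined so far (same <code>Tensor</code> and <code>Children</code> as above): the graph is built once, then the leaves are given fresh values and <code>forward()</code> recomputes everything.</p>

```python
import numpy as np
from collections import namedtuple

Children = namedtuple('Children', ['a', 'b', 'op'])

class Tensor:
    def __init__(self, value=None, children=None):
        self.value = value
        self.children = children

    def forward(self):
        # leaves return themselves; inner nodes recompute from their children
        if self.children is None:
            return self
        a = self.children.a.forward()
        b = self.children.b.forward()
        if a.value is not None and b.value is not None:
            self.value = self.children.op(a.value, b.value)
        return self

    def __add__(self, other):
        return Tensor(children=Children(self, other, np.add)).forward()

    def __mul__(self, other):
        return Tensor(children=Children(self, other, np.multiply)).forward()

# build the graph once, then reuse it with different leaf values
x, y = Tensor(), Tensor()
z2 = (x + y) * y
x.value, y.value = 3, 5
print(z2.forward().value)   # (3 + 5) * 5 = 40
x.value, y.value = 2, 10
print(z2.forward().value)   # same graph, new values: (2 + 10) * 10 = 120
```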
<p><br /></p>
<h2>Adding Automatic Differentiation</h2>
<p>To add automatic differentiation to an arbitrary computation graph, we simply implement the derivative of each basic operation supported by our <code>Tensor</code> class. Recursive calls to the <code>grad</code> method then traverse the computation graph, decomposing a complex function into a combination of simple functions whose derivatives are known.</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">grad</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">deriv_to</span><span class="p">):</span>
<span class="c1"># Derivative of a tensor with itself is 1</span>
<span class="k">if</span> <span class="bp">self</span> <span class="ow">is</span> <span class="n">deriv_to</span><span class="p">:</span>
<span class="k">return</span> <span class="n">Tensor</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
<span class="c1"># Derivative of a scalar with another tensor is 0</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="k">return</span> <span class="n">Tensor</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span> <span class="ow">is</span> <span class="n">np</span><span class="o">.</span><span class="n">add</span><span class="p">:</span> <span class="c1"># (a + b)' = a' + b'</span>
<span class="n">t</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span> <span class="o">+</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span>
<span class="k">elif</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span> <span class="ow">is</span> <span class="n">np</span><span class="o">.</span><span class="n">multiply</span><span class="p">:</span> <span class="c1"># (ab)' = a'b + ab'</span>
<span class="n">t</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span> <span class="o">+</span> \
<span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">NotImplementedError</span><span class="p">(</span><span class="sa">f</span><span class="s2">"This op is not implemented. </span><span class="si">{</span><span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span>
<span class="k">return</span> <span class="n">t</span>
</code></pre></div>
<p>We can now differentiate <code>z2</code> with respect to the variable of our choice:</p>
<div class="highlight"><pre><span></span><code><span class="nb">print</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>
<span class="n">g</span> <span class="o">=</span> <span class="n">z2</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">y</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">g</span><span class="p">)</span>
<span class="c1"># Out</span>
<span class="c1"># T:3 T:5</span>
<span class="c1"># T:13</span>
</code></pre></div>
<p>Here, <code>g</code> is not just a value: <strong>it is a new computation graph</strong> that represents the partial derivative of <code>z2</code> with respect to <code>y</code>. Since the values of <code>x</code> and <code>y</code> were already set when <code>grad</code> was called, the value of <code>g</code> could be computed. The computation graph of <code>g</code> can be represented by this diagram:</p>
<p><img src="imgs/auto-diff/graph_xyz1z2_grad.png" alt="Computation graph of the partial derivative of z2 in function of y" width="300" /></p>
<p>Symbolically, $g = \frac{\partial z_2}{\partial y} = x + 2y$, and when <code>x</code> and <code>y</code> are 3 and 5 respectively, <code>g</code> is 13.</p>
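<p>As a sanity check, the exact result can be compared against a plain finite-difference approximation of the same derivative (the step size <code>h</code> below is an arbitrary choice):</p>

```python
# f is the function computed by the z2 graph: f(x, y) = (x + y) * y
f = lambda x, y: (x + y) * y

h = 1e-6  # small step for the finite-difference quotient
numeric = (f(3, 5 + h) - f(3, 5)) / h
print(numeric)  # approximately 13, the value automatic differentiation gave exactly
```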
<h3>Enabling the <code>Tensor</code> class to handle more complex formulas</h3>
<p>To be able to use more complex formulas, we will add more operations to the <code>Tensor</code> class. We add subtraction, division, exponential and negation (<code>-x</code>).<br>
Here is the <code>Tensor</code> class in its final form:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">Tensor</span><span class="p">:</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">value</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">children</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">children</span> <span class="o">=</span> <span class="n">children</span>
<span class="bp">self</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="n">value</span>
<span class="bp">self</span><span class="o">.</span><span class="n">name</span> <span class="o">=</span> <span class="n">name</span>
<span class="k">def</span> <span class="nf">forward</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="k">return</span> <span class="bp">self</span>
<span class="n">a</span> <span class="o">=</span> <span class="kc">None</span>
<span class="n">b</span> <span class="o">=</span> <span class="kc">None</span>
<span class="c1"># compute forward pass of children in the tree</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">a</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">b</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
<span class="c1"># If a has a specific value after forward pass</span>
<span class="k">if</span> <span class="n">a</span><span class="o">.</span><span class="n">value</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="c1"># If the operation does not need a term b (like exp(a) for example)</span>
<span class="c1"># Use only a</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span><span class="p">(</span><span class="n">a</span><span class="o">.</span><span class="n">value</span><span class="p">)</span>
<span class="c1"># Else if op needs a second term b and his value is not None after forward pass</span>
<span class="k">elif</span> <span class="n">b</span><span class="o">.</span><span class="n">value</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span><span class="p">(</span><span class="n">a</span><span class="o">.</span><span class="n">value</span><span class="p">,</span> <span class="n">b</span><span class="o">.</span><span class="n">value</span><span class="p">)</span>
<span class="k">return</span> <span class="bp">self</span>
<span class="k">def</span> <span class="nf">grad</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">deriv_to</span><span class="p">):</span>
<span class="c1"># Derivative of a tensor with itself is 1</span>
<span class="k">if</span> <span class="bp">self</span> <span class="ow">is</span> <span class="n">deriv_to</span><span class="p">:</span>
<span class="k">return</span> <span class="n">Tensor</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
<span class="c1"># Derivative of a scalar with another tensor is 0</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="k">return</span> <span class="n">Tensor</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span> <span class="ow">is</span> <span class="n">np</span><span class="o">.</span><span class="n">add</span><span class="p">:</span> <span class="c1"># (a + b)' = a' + b'</span>
<span class="n">t</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span> <span class="o">+</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span>
<span class="k">elif</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span> <span class="ow">is</span> <span class="n">np</span><span class="o">.</span><span class="n">subtract</span><span class="p">:</span> <span class="c1"># (a - b)' = a' - b'</span>
<span class="n">t</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span> <span class="o">-</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span>
<span class="k">elif</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span> <span class="ow">is</span> <span class="n">np</span><span class="o">.</span><span class="n">multiply</span><span class="p">:</span> <span class="c1"># (ab)' = a'b + ab'</span>
<span class="n">t</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span> <span class="o">+</span> \
<span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span>
<span class="k">elif</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span> <span class="ow">is</span> <span class="n">np</span><span class="o">.</span><span class="n">divide</span><span class="p">:</span> <span class="c1"># (a/b)' = (a'b - ab') / b²</span>
<span class="n">t</span> <span class="o">=</span> <span class="p">(</span>
<span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span> <span class="o">-</span> \
<span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span>
<span class="p">)</span> <span class="o">/</span> \
<span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span><span class="p">)</span>
<span class="k">elif</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span> <span class="ow">is</span> <span class="n">np</span><span class="o">.</span><span class="n">exp</span><span class="p">:</span> <span class="c1"># exp(a)' = a'exp(a)</span>
<span class="n">t</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span><span class="o">.</span><span class="n">exp</span><span class="p">()</span>
<span class="k">else</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">NotImplementedError</span><span class="p">(</span><span class="sa">f</span><span class="s2">"This op is not implemented. </span><span class="si">{</span><span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span>
<span class="k">return</span> <span class="n">t</span>
<span class="k">def</span> <span class="fm">__repr__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="sa">f</span><span class="s2">"T:</span><span class="si">{</span><span class="bp">self</span><span class="o">.</span><span class="n">value</span><span class="si">}</span><span class="s2">"</span>
<span class="k">def</span> <span class="fm">__add__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span>
<span class="n">c</span> <span class="o">=</span> <span class="n">Children</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">add</span><span class="p">)</span>
<span class="n">t</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="n">children</span><span class="o">=</span><span class="n">c</span><span class="p">)</span>
<span class="k">return</span> <span class="n">t</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
<span class="k">def</span> <span class="fm">__sub__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span>
<span class="n">c</span> <span class="o">=</span> <span class="n">Children</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">subtract</span><span class="p">)</span>
<span class="n">t</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="n">children</span><span class="o">=</span><span class="n">c</span><span class="p">)</span>
<span class="k">return</span> <span class="n">t</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
<span class="k">def</span> <span class="fm">__mul__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span>
<span class="n">c</span> <span class="o">=</span> <span class="n">Children</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">multiply</span><span class="p">)</span>
<span class="n">t</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="n">children</span><span class="o">=</span><span class="n">c</span><span class="p">)</span>
<span class="k">return</span> <span class="n">t</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
<span class="k">def</span> <span class="fm">__truediv__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span>
<span class="n">c</span> <span class="o">=</span> <span class="n">Children</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">divide</span><span class="p">)</span>
<span class="n">t</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="n">children</span><span class="o">=</span><span class="n">c</span><span class="p">)</span>
<span class="k">return</span> <span class="n">t</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
<span class="k">def</span> <span class="fm">__neg__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="n">c</span> <span class="o">=</span> <span class="n">Children</span><span class="p">(</span><span class="n">Tensor</span><span class="p">(</span><span class="n">value</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">zeros_like</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">value</span><span class="p">)),</span> <span class="bp">self</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">subtract</span><span class="p">)</span>
<span class="n">t</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="n">children</span><span class="o">=</span><span class="n">c</span><span class="p">)</span>
<span class="k">return</span> <span class="n">t</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">exp</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="n">c</span> <span class="o">=</span> <span class="n">Children</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="kc">None</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">exp</span><span class="p">)</span>
<span class="n">t</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="n">children</span><span class="o">=</span><span class="n">c</span><span class="p">)</span>
<span class="k">return</span> <span class="n">t</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
</code></pre></div>
<p>For each operation added to the <code>Tensor</code> class, the corresponding derivative has been included in the <code>grad</code> method. We have also modified <code>forward</code> to handle more cases, in particular operations that take a single operand, like the exponential (negation, by contrast, is implemented as subtraction from a zero tensor). </p>
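<p>A minimal sketch of how the unary case can be dispatched in a forward pass (an assumption about the full code, which is not shown in its entirety here; <code>apply_op</code> is a hypothetical helper): when the second child value is <code>None</code>, the stored NumPy op is called with a single argument.</p>

```python
import numpy as np

# Hypothetical helper (an assumption, not the article's exact code):
# dispatch a stored op over one or two child values.
# A `None` second value marks a unary op such as np.exp.
def apply_op(op, a_value, b_value=None):
    if b_value is None:
        return op(a_value)       # unary: exp(a)
    return op(a_value, b_value)  # binary: a + b, a * b, ...

print(apply_op(np.add, 3, 5))  # 8
print(apply_op(np.exp, 0.0))   # 1.0
```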
<p>Now let's build a more complex formula and differentiate it!<br>
Let's try to differentiate $z$:</p>
<p>$$z = \frac{12 - (x * e^{y})}{45 + x * y * e^{-x}}$$
<!-- $$z = (12 - (x * exp(y))) / (45 + x * y * exp(-x) )$$ --></p>
<p>We just have to write this equation using our <code>Tensor</code> class: </p>
<div class="highlight"><pre><span></span><code><span class="n">x</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span>
<span class="n">z</span> <span class="o">=</span> <span class="p">(</span><span class="n">Tensor</span><span class="p">(</span><span class="mi">12</span><span class="p">)</span> <span class="o">-</span> <span class="p">(</span><span class="n">x</span> <span class="o">*</span> <span class="n">y</span><span class="o">.</span><span class="n">exp</span><span class="p">()))</span> <span class="o">/</span> <span class="p">(</span><span class="n">Tensor</span><span class="p">(</span><span class="mi">45</span><span class="p">)</span> <span class="o">+</span> <span class="n">x</span> <span class="o">*</span> <span class="n">y</span> <span class="o">*</span> <span class="p">(</span><span class="o">-</span><span class="n">x</span><span class="p">)</span><span class="o">.</span><span class="n">exp</span><span class="p">())</span>
</code></pre></div>
<p>This will generate for the tensor <code>z</code>, the following computation graph:</p>
<p><img src="imgs/auto-diff/graph_complexe.png" alt="Computation graph of z" width="300" /></p>
<p>We can now easily compute the partial derivatives of <code>z</code> with respect to <code>x</code> and <code>y</code> with the following code: </p>
<div class="highlight"><pre><span></span><code><span class="nb">print</span><span class="p">(</span><span class="n">z</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">x</span><span class="p">))</span> <span class="c1"># T:-3.34729777301069</span>
<span class="nb">print</span><span class="p">(</span><span class="n">z</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">y</span><span class="p">))</span> <span class="c1"># T:-9.70176956641438</span>
</code></pre></div>
<p>This will generate the following two graphs:</p>
<p><img src="imgs/auto-diff/graph_complexe_grad_x.png" alt="Computation graph of derivative of z in function of x" /><img src="imgs/auto-diff/graph_complexe_grad_y.png" alt="Computation graph of derivative of z in function of y" /></p>
<p>Finally, to check that our automatic differentiation system works, we can compare the numerical values of our derivatives with those computed by the Sympy library:</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">sympy</span> <span class="k">as</span> <span class="nn">sym</span>
<span class="n">xs</span> <span class="o">=</span> <span class="n">sym</span><span class="o">.</span><span class="n">Symbol</span><span class="p">(</span><span class="s1">'xs'</span><span class="p">)</span>
<span class="n">ys</span> <span class="o">=</span> <span class="n">sym</span><span class="o">.</span><span class="n">Symbol</span><span class="p">(</span><span class="s1">'ys'</span><span class="p">)</span>
<span class="n">zs</span> <span class="o">=</span> <span class="p">(</span><span class="mi">12</span> <span class="o">-</span> <span class="p">(</span><span class="n">xs</span> <span class="o">*</span> <span class="n">sym</span><span class="o">.</span><span class="n">exp</span><span class="p">(</span><span class="n">ys</span><span class="p">)))</span> <span class="o">/</span> <span class="p">(</span><span class="mi">45</span> <span class="o">+</span> <span class="p">((</span><span class="n">xs</span> <span class="o">*</span> <span class="n">ys</span><span class="p">)</span> <span class="o">*</span> <span class="n">sym</span><span class="o">.</span><span class="n">exp</span><span class="p">(</span><span class="o">-</span><span class="n">xs</span><span class="p">))</span> <span class="p">)</span>
<span class="n">d</span> <span class="o">=</span> <span class="n">zs</span><span class="o">.</span><span class="n">diff</span><span class="p">(</span><span class="n">ys</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">zs</span><span class="o">.</span><span class="n">diff</span><span class="p">(</span><span class="n">xs</span><span class="p">)</span><span class="o">.</span><span class="n">evalf</span><span class="p">(</span><span class="n">subs</span><span class="o">=</span><span class="p">{</span><span class="n">xs</span><span class="p">:</span><span class="mi">3</span><span class="p">,</span> <span class="n">ys</span><span class="p">:</span><span class="mi">5</span><span class="p">}))</span> <span class="c1"># -3.34729777301069</span>
<span class="nb">print</span><span class="p">(</span><span class="n">zs</span><span class="o">.</span><span class="n">diff</span><span class="p">(</span><span class="n">ys</span><span class="p">)</span><span class="o">.</span><span class="n">evalf</span><span class="p">(</span><span class="n">subs</span><span class="o">=</span><span class="p">{</span><span class="n">xs</span><span class="p">:</span><span class="mi">3</span><span class="p">,</span> <span class="n">ys</span><span class="p">:</span><span class="mi">5</span><span class="p">}))</span> <span class="c1"># -9.70176956641438</span>
</code></pre></div>
<p>The result obtained with the Sympy library is exactly the same as with our <code>Tensor</code> class!</p>
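<p>Another way to cross-check the result, without any library (a quick sketch using a central finite difference, which only <em>approximates</em> the derivative, unlike the exact values above):</p>

```python
import math

# The same formula for z, written as a plain Python function.
def z(x, y):
    return (12 - x * math.exp(y)) / (45 + x * y * math.exp(-x))

# Central finite difference: (f(x+h) - f(x-h)) / (2h), error O(h^2).
h = 1e-6
dz_dx = (z(3 + h, 5) - z(3 - h, 5)) / (2 * h)
dz_dy = (z(3, 5 + h) - z(3, 5 - h)) / (2 * h)
print(dz_dx)  # close to -3.34729777...
print(dz_dy)  # close to -9.70176956...
```

Note that each partial derivative costs two extra evaluations of the function and is only approximate, which is exactly what automatic differentiation avoids.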
<p><br /></p>
<h2>Possible Improvements & Optimizations</h2>
<p>We have just created about the simplest automatic differentiation system possible, and probably also the slowest. We can add more complex operations as long as we know how to differentiate them. As it stands, this class can only handle scalars; for such a library to be truly useful, one would have to add <strong>operations on arrays</strong> of arbitrary sizes. <br>
<!-- A great asset of automatic differentiation libraries such as <a href="https://en.wikipedia.org/wiki/TensorFlow">TensorFlow</a> or <a href="https://en.wikipedia.org/wiki/PyTorch">PyTorch</a> -->
Also, looking at the graphs, we can see that some optimizations are possible:
<ul>
<li>If we are at a multiplication node and one of the two <code>children</code> is 0, we <strong>should not explore further</strong>, since anything multiplied by 0 is 0. </li>
<li>When traversing the tree to differentiate with respect to a tensor <code>x</code>, if we reach a node that does not depend on <code>x</code> and whose children do not depend on <code>x</code>, we could stop the traversal there and <strong>treat the current node as a constant</strong>. This kind of optimization could greatly improve computation speed for graphs with many nodes and variables. </li>
<li>Looking at the graph, we can see that some operations are repeated. We could <strong>set up a cache</strong> to avoid redoing the same computations several times.</li>
</ul></p>
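<p>The short-circuit idea for multiplication can be sketched as follows (hypothetical code, not part of the <code>Tensor</code> class above): by passing the child gradients as zero-argument callables, the untaken branch is never evaluated.</p>

```python
# Hypothetical short-circuit for the product rule (ab)' = a'b + ab':
# da and db are zero-argument callables (thunks), so a skipped
# branch costs nothing.
def mul_grad(a, b, da, db):
    left = 0 if b == 0 else da() * b
    right = 0 if a == 0 else a * db()
    return left + right

calls = []
dz = mul_grad(0, 5,
              lambda: calls.append('da') or 1,
              lambda: calls.append('db') or 7)
print(dz)     # 5: da() * 5; the right branch is skipped because a == 0
print(calls)  # ['da'] -- db was never evaluated
```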
<p><br /></p>
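<p>The caching idea can be illustrated on a stripped-down graph (a sketch with a hypothetical <code>Node</code> type, not the article's <code>Tensor</code>): memoizing results by node identity makes a shared subexpression cost one visit instead of many.</p>

```python
import operator
from collections import namedtuple

# Hypothetical minimal graph node: op=None marks a leaf, whose
# variable name is stored in `a`.
Node = namedtuple('Node', ['a', 'b', 'op'])

visits = 0

def evaluate(node, env, cache=None):
    """Evaluate the graph bottom-up, caching results by node identity."""
    global visits
    if cache is None:
        cache = {}
    if id(node) in cache:
        return cache[id(node)]
    visits += 1
    if node.op is None:
        value = env[node.a]  # leaf: variable lookup
    else:
        value = node.op(evaluate(node.a, env, cache),
                        evaluate(node.b, env, cache))
    cache[id(node)] = value
    return value

x = Node('x', None, None)
shared = Node(x, x, operator.mul)          # x*x, referenced twice below
expr = Node(shared, shared, operator.add)  # (x*x) + (x*x)
print(evaluate(expr, {'x': 3}))  # 18
print(visits)                    # 3 distinct nodes visited, not 5
```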
<p>I hope this article has helped you understand how automatic differentiation is performed for neural network optimization and training.<br>
Feel free to share your feedback in the comments!</p>
Comprendre la différenciation automatique en 30 lignes de Python (Understanding Automatic Differentiation in 30 lines of Python), 2023-02-17, Victor MARTIN, tag:vmartin.fr,2023-02-17:/understanding-automatic-differentiation-in-30-lines-of-python-fr.html<!-- Status: hidden -->
<p>---> <a href="/understanding-automatic-differentiation-in-30-lines-of-python.html">For English version of this article, click here</a></p>
<!--
Improve the sympy jacobian article to mention that scipy estimates the derivative numerically, whereas sympy can sometimes compute it exactly without trouble.
Beyond neural networks, many optimization algorithms require computing the gradient, i.e. the vector containing all the partial derivatives. The `scipy.optimize.optimize` function of the Numpy/Scipy library estimates the gradient at a point $x$ with the following formula:
$$\lim_{h\to 0} \frac{f(x+h) - f(x)}{h} $$
This gives a numerical value for the derivative of $f$ at $x$, but it is an approximation, and it requires two calls to the function $f$, which can be costly. -->
<p>I'm a Machine Learning engineer and I use libraries such as Tensorflow and Pytorch in my work to train my neural networks. For a while now I have wanted to write the simplest possible piece of code that performs what is called <strong><a href="https://en.wikipedia.org/wiki/Automatic_differentiation">automatic differentiation</a></strong>, which is at the heart of how neural networks learn.<br>
In this article, I will try to build, step by step, the simplest code that computes derivatives automatically on scalars.</p>
<p>In the following Python code, the sum of <code>x</code> and <code>y</code> is computed and the result (<code>8</code>) is assigned to the variable <code>z</code>. After the assignment, the variable <code>z</code> keeps no trace of the variables used to produce it: there is no way to automatically update the value of <code>z</code> if <code>x</code> or <code>y</code> changes, let alone to understand the relationship between the variables in order to compute a derivative automatically.</p>
<div class="highlight"><pre><span></span><code><span class="n">x</span> <span class="o">=</span> <span class="mi">3</span>
<span class="n">y</span> <span class="o">=</span> <span class="mi">5</span>
<span class="n">z</span> <span class="o">=</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span>
</code></pre></div>
<h2>The <code>Tensor</code> class</h2>
<p>The idea is to create a new type, a <code>Tensor</code>, that lets us do <strong>symbolic computation</strong> on our variables.<br>
Let's start by creating a <code>Tensor</code> class in which the addition operation is redefined. </p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
<span class="k">class</span> <span class="nc">Tensor</span><span class="p">:</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">value</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="n">value</span>
<span class="k">def</span> <span class="fm">__repr__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="sa">f</span><span class="s2">"T:</span><span class="si">{</span><span class="bp">self</span><span class="o">.</span><span class="n">value</span><span class="si">}</span><span class="s2">"</span>
<span class="k">def</span> <span class="fm">__add__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span>
<span class="n">t</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="n">value</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">value</span> <span class="o">+</span> <span class="n">other</span><span class="o">.</span><span class="n">value</span><span class="p">)</span>
<span class="k">return</span> <span class="n">t</span>
<span class="n">x</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span>
<span class="n">z</span> <span class="o">=</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span>
<span class="nb">print</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">z</span><span class="p">)</span>
<span class="c1"># Out:</span>
<span class="c1"># T:3 T:5</span>
<span class="c1"># T:8</span>
</code></pre></div>
<p>In this example, we create a <code>Tensor</code> class that can store a value, and we redefine addition so that adding two <code>Tensor</code>s creates a new <code>Tensor</code>. There is not yet any symbolic-computation mechanism that would let <code>z</code> keep track of the fact that it is the result of adding <code>x</code> and <code>y</code>.<br>
We will add this behaviour using <strong>a binary tree</strong>. Each tensor will be able to hold two other tensors and the operation that produced it. For that, we introduce the <code>Children</code> tuple, which holds these three pieces of information.</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
<span class="kn">from</span> <span class="nn">collections</span> <span class="kn">import</span> <span class="n">namedtuple</span>
<span class="n">Children</span> <span class="o">=</span> <span class="n">namedtuple</span><span class="p">(</span><span class="s1">'Children'</span><span class="p">,</span> <span class="p">[</span><span class="s1">'a'</span><span class="p">,</span> <span class="s1">'b'</span><span class="p">,</span> <span class="s1">'op'</span><span class="p">])</span>
<span class="k">class</span> <span class="nc">Tensor</span><span class="p">:</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">value</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">children</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="n">value</span>
<span class="bp">self</span><span class="o">.</span><span class="n">children</span> <span class="o">=</span> <span class="n">children</span>
<span class="k">def</span> <span class="nf">forward</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="k">return</span> <span class="bp">self</span>
<span class="c1"># compute forward pass of children in the tree</span>
<span class="n">a</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
<span class="n">b</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
<span class="c1"># If values are set, let's compute the real value of this tensor</span>
<span class="k">if</span> <span class="n">a</span><span class="o">.</span><span class="n">value</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span> <span class="ow">and</span> <span class="n">b</span><span class="o">.</span><span class="n">value</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span><span class="p">(</span><span class="n">a</span><span class="o">.</span><span class="n">value</span><span class="p">,</span> <span class="n">b</span><span class="o">.</span><span class="n">value</span><span class="p">)</span>
<span class="k">return</span> <span class="bp">self</span>
<span class="k">def</span> <span class="fm">__repr__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="sa">f</span><span class="s2">"T:</span><span class="si">{</span><span class="bp">self</span><span class="o">.</span><span class="n">value</span><span class="si">}</span><span class="s2">"</span>
<span class="k">def</span> <span class="fm">__add__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span>
<span class="n">c</span> <span class="o">=</span> <span class="n">Children</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">add</span><span class="p">)</span>
<span class="n">t</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="n">children</span><span class="o">=</span><span class="n">c</span><span class="p">)</span>
<span class="k">return</span> <span class="n">t</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
<span class="k">def</span> <span class="fm">__mul__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span>
<span class="n">c</span> <span class="o">=</span> <span class="n">Children</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">multiply</span><span class="p">)</span>
<span class="n">t</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="n">children</span><span class="o">=</span><span class="n">c</span><span class="p">)</span>
<span class="k">return</span> <span class="n">t</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
<span class="n">x</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span>
<span class="n">z1</span> <span class="o">=</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span>
<span class="n">z2</span> <span class="o">=</span> <span class="n">z1</span> <span class="o">*</span> <span class="n">y</span>
<span class="nb">print</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">z2</span><span class="p">)</span>
<span class="c1"># Out</span>
<span class="c1"># T:3 T:5</span>
<span class="c1"># T:40</span>
</code></pre></div>
<p>Now a tensor, in addition to holding a numerical value, holds the <code>children</code> tuple, which lets it keep track of the computation. In this example, besides introducing the <code>Children</code> type, we have added the multiplication method on tensors. The <code>forward</code> method has also been added to the <code>Tensor</code> class so that we can <strong>execute the computation graph</strong> and compute the actual values of the tensors. The tensor <code>z2</code> can be represented by the following computation graph.</p>
<!-- ![Pelican](imgs/auto-diff/graph_xyzz1.png =100x20) -->
<p><img src="imgs/auto-diff/graph_xyz1z2.png" alt="Computation graph of the tensor z2" width="300" /></p>
<p>We can check that this works as expected by first creating the graph without specifying any values:</p>
<div class="highlight"><pre><span></span><code><span class="n">x</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="kc">None</span><span class="p">)</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="kc">None</span><span class="p">)</span>
<span class="n">z1</span> <span class="o">=</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span>
<span class="n">z2</span> <span class="o">=</span> <span class="n">z1</span> <span class="o">*</span> <span class="n">y</span>
<span class="nb">print</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">z2</span><span class="p">)</span>
<span class="c1"># Out</span>
<span class="c1"># T:None T:None</span>
<span class="c1"># T:None</span>
</code></pre></div>
<p>Then the values of the leaves of the tree (<code>x</code> and <code>y</code>) can be changed and the value of <code>z2</code> computed. Calling <code>z2.forward()</code> triggers calls to the <code>forward</code> methods of <code>z1</code> and <code>y</code>, and these calls walk down the graph to recursively compute the value of <code>z2</code>. </p>
<div class="highlight"><pre><span></span><code><span class="n">x</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="mi">3</span>
<span class="n">y</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="mi">5</span>
<span class="nb">print</span><span class="p">(</span><span class="n">z2</span><span class="o">.</span><span class="n">forward</span><span class="p">())</span>
<span class="c1"># Out</span>
<span class="c1"># T:40</span>
</code></pre></div>
<p><br /></p>
<h2>Adding automatic differentiation</h2>
<p>To add automatic differentiation to an arbitrary computation graph, we simply add the derivative of each basic operation supported by our <code>Tensor</code> class. Recursive calls to the <code>grad</code> function traverse the computation graph and decompose a complex function to differentiate into a combination of simple functions.</p>
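<p>As a sanity check on these rules (a hand-worked sketch, independent of the <code>Tensor</code> class): for the earlier example z2 = (x + y) * y, applying the product and sum rules recursively gives the same result as expanding the expression first.</p>

```python
x, y = 3, 5

# Product rule on z2 = z1 * y, with z1 = x + y:
#   dz2/dy = (dz1/dy) * y + z1 * (dy/dy)
dz1_dy = 0 + 1                     # sum rule: dx/dy + dy/dy
dz2_dy = dz1_dy * y + (x + y) * 1
print(dz2_dy)  # 13

# Cross-check by expanding: z2 = x*y + y**2, so dz2/dy = x + 2*y
print(x + 2 * y)  # 13
```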
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">grad</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">deriv_to</span><span class="p">):</span>
    <span class="c1"># Derivative of a tensor with itself is 1</span>
    <span class="k">if</span> <span class="bp">self</span> <span class="ow">is</span> <span class="n">deriv_to</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">Tensor</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
    <span class="c1"># Derivative of a scalar with another tensor is 0</span>
    <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">Tensor</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
    <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span> <span class="ow">is</span> <span class="n">np</span><span class="o">.</span><span class="n">add</span><span class="p">:</span> <span class="c1"># (a + b)' = a' + b'</span>
        <span class="n">t</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span> <span class="o">+</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span>
    <span class="k">elif</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span> <span class="ow">is</span> <span class="n">np</span><span class="o">.</span><span class="n">multiply</span><span class="p">:</span> <span class="c1"># (ab)' = a'b + ab'</span>
        <span class="n">t</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span> <span class="o">+</span> \
            <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="k">raise</span> <span class="ne">NotImplementedError</span><span class="p">(</span><span class="sa">f</span><span class="s2">"This op is not implemented. </span><span class="si">{</span><span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">t</span>
</code></pre></div>
<p>We can now differentiate <code>z2</code> with respect to any variable we choose:</p>
<div class="highlight"><pre><span></span><code><span class="nb">print</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>
<span class="n">g</span> <span class="o">=</span> <span class="n">z2</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">y</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">g</span><span class="p">)</span>
<span class="c1"># Out</span>
<span class="c1"># T:3 T:5</span>
<span class="c1"># T:13</span>
</code></pre></div>
<p>Here, <code>g</code> is not just a value, <strong>it is a new computation graph</strong> representing the partial derivative of <code>z2</code> with respect to <code>y</code>. Since the values of <code>x</code> and <code>y</code> were set at the time <code>grad</code> was called, the value of <code>g</code> could be computed. The computation graph of <code>g</code> can be represented by this diagram: </p>
<p><img src="imgs/auto-diff/graph_xyz1z2_grad.png" alt="Computation graph of the partial derivative of z2 in function of y" width="300" /></p>
<p>Literally, $g = \frac{\partial z_2}{\partial y} = x + 2y$, and when <code>x</code> and <code>y</code> are 3 and 5 respectively, <code>g</code> is 13.</p>
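<p>Since <code>g</code> is itself a computation graph, nothing stops us from calling <code>grad</code> on it again to obtain a second derivative. Here is a minimal, self-contained sketch of that idea, using a condensed version of the <code>Tensor</code> and <code>Children</code> classes from this article, restricted to <code>+</code> and <code>*</code> to keep it short:</p>

```python
import numpy as np
from collections import namedtuple

# Condensed versions of the classes built in this article,
# limited to addition and multiplication.
Children = namedtuple("Children", ["a", "b", "op"])

class Tensor:
    def __init__(self, value=None, children=None):
        self.value = value
        self.children = children

    def forward(self):
        if self.children is None:
            return self
        a = self.children.a.forward()
        b = self.children.b.forward()
        self.value = self.children.op(a.value, b.value)
        return self

    def grad(self, deriv_to):
        if self is deriv_to:
            return Tensor(1)
        if self.children is None:
            return Tensor(0)
        a, b = self.children.a, self.children.b
        if self.children.op is np.add:        # (a + b)' = a' + b'
            return a.grad(deriv_to) + b.grad(deriv_to)
        if self.children.op is np.multiply:   # (ab)' = a'b + ab'
            return a.grad(deriv_to) * b + a * b.grad(deriv_to)
        raise NotImplementedError(self.children.op)

    def __add__(self, other):
        return Tensor(children=Children(self, other, np.add)).forward()

    def __mul__(self, other):
        return Tensor(children=Children(self, other, np.multiply)).forward()

x, y = Tensor(3), Tensor(5)
z2 = (x + y) * y          # same graph as in the article: z2 = (x + y) * y
g = z2.grad(y)            # first derivative graph: x + 2y -> 13
g2 = g.grad(y)            # differentiating the derivative graph again -> 2
print(g.value, g2.value)  # 13 2
```

<p>The second call, <code>g.grad(y)</code>, walks the graph produced by the first call, so higher-order derivatives come for free with this design.</p>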
<h3>Letting the <code>Tensor</code> class handle more complex formulas</h3>
<p>To support more complex formulas, we will add more operations to the <code>Tensor</code> class: subtraction, division, the exponential, and negation (<code>-x</code>).<br>
Here is the <code>Tensor</code> class in its final form:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">Tensor</span><span class="p">:</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">value</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">children</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">children</span> <span class="o">=</span> <span class="n">children</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="n">value</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">name</span> <span class="o">=</span> <span class="n">name</span>
    <span class="k">def</span> <span class="nf">forward</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
            <span class="k">return</span> <span class="bp">self</span>
        <span class="n">a</span> <span class="o">=</span> <span class="kc">None</span>
        <span class="n">b</span> <span class="o">=</span> <span class="kc">None</span>
        <span class="c1"># compute forward pass of children in the tree</span>
        <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
            <span class="n">a</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
        <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
            <span class="n">b</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
        <span class="c1"># If a has a specific value after forward pass</span>
        <span class="k">if</span> <span class="n">a</span><span class="o">.</span><span class="n">value</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
            <span class="c1"># If the operation does not need a term b (like exp(a) for example)</span>
            <span class="c1"># Use only a</span>
            <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
                <span class="bp">self</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span><span class="p">(</span><span class="n">a</span><span class="o">.</span><span class="n">value</span><span class="p">)</span>
            <span class="c1"># Else if op needs a second term b and its value is not None after forward pass</span>
            <span class="k">elif</span> <span class="n">b</span><span class="o">.</span><span class="n">value</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
                <span class="bp">self</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span><span class="p">(</span><span class="n">a</span><span class="o">.</span><span class="n">value</span><span class="p">,</span> <span class="n">b</span><span class="o">.</span><span class="n">value</span><span class="p">)</span>
        <span class="k">return</span> <span class="bp">self</span>
    <span class="c1"># TODO: manage case when the two tensors are independent</span>
    <span class="k">def</span> <span class="nf">grad</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">deriv_to</span><span class="p">):</span>
        <span class="c1"># Derivative of a tensor with itself is 1</span>
        <span class="k">if</span> <span class="bp">self</span> <span class="ow">is</span> <span class="n">deriv_to</span><span class="p">:</span>
            <span class="k">return</span> <span class="n">Tensor</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
        <span class="c1"># Derivative of a scalar with another tensor is 0</span>
        <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
            <span class="k">return</span> <span class="n">Tensor</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
        <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span> <span class="ow">is</span> <span class="n">np</span><span class="o">.</span><span class="n">add</span><span class="p">:</span> <span class="c1"># (a + b)' = a' + b'</span>
            <span class="n">t</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span> <span class="o">+</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span>
        <span class="k">elif</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span> <span class="ow">is</span> <span class="n">np</span><span class="o">.</span><span class="n">subtract</span><span class="p">:</span> <span class="c1"># (a - b)' = a' - b'</span>
            <span class="n">t</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span> <span class="o">-</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span>
        <span class="k">elif</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span> <span class="ow">is</span> <span class="n">np</span><span class="o">.</span><span class="n">multiply</span><span class="p">:</span> <span class="c1"># (ab)' = a'b + ab'</span>
            <span class="n">t</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span> <span class="o">+</span> \
                <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span>
        <span class="k">elif</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span> <span class="ow">is</span> <span class="n">np</span><span class="o">.</span><span class="n">divide</span><span class="p">:</span> <span class="c1"># (a/b)' = (a'b - ab') / b²</span>
            <span class="n">t</span> <span class="o">=</span> <span class="p">(</span>
                <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span> <span class="o">-</span> \
                <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span>
            <span class="p">)</span> <span class="o">/</span> \
            <span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">b</span><span class="p">)</span>
        <span class="k">elif</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span> <span class="ow">is</span> <span class="n">np</span><span class="o">.</span><span class="n">exp</span><span class="p">:</span> <span class="c1"># exp(a)' = a'exp(a)</span>
            <span class="n">t</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">deriv_to</span><span class="p">)</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">a</span><span class="o">.</span><span class="n">exp</span><span class="p">()</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="k">raise</span> <span class="ne">NotImplementedError</span><span class="p">(</span><span class="sa">f</span><span class="s2">"This op is not implemented. </span><span class="si">{</span><span class="bp">self</span><span class="o">.</span><span class="n">children</span><span class="o">.</span><span class="n">op</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span>
        <span class="k">return</span> <span class="n">t</span>
    <span class="k">def</span> <span class="fm">__repr__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">return</span> <span class="sa">f</span><span class="s2">"T:</span><span class="si">{</span><span class="bp">self</span><span class="o">.</span><span class="n">value</span><span class="si">}</span><span class="s2">"</span>
    <span class="k">def</span> <span class="fm">__add__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span>
        <span class="n">c</span> <span class="o">=</span> <span class="n">Children</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">add</span><span class="p">)</span>
        <span class="n">t</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="n">children</span><span class="o">=</span><span class="n">c</span><span class="p">)</span>
        <span class="k">return</span> <span class="n">t</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
    <span class="k">def</span> <span class="fm">__sub__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span>
        <span class="n">c</span> <span class="o">=</span> <span class="n">Children</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">subtract</span><span class="p">)</span>
        <span class="n">t</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="n">children</span><span class="o">=</span><span class="n">c</span><span class="p">)</span>
        <span class="k">return</span> <span class="n">t</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
    <span class="k">def</span> <span class="fm">__mul__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span>
        <span class="n">c</span> <span class="o">=</span> <span class="n">Children</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">multiply</span><span class="p">)</span>
        <span class="n">t</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="n">children</span><span class="o">=</span><span class="n">c</span><span class="p">)</span>
        <span class="k">return</span> <span class="n">t</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
    <span class="k">def</span> <span class="fm">__truediv__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span>
        <span class="n">c</span> <span class="o">=</span> <span class="n">Children</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">divide</span><span class="p">)</span>
        <span class="n">t</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="n">children</span><span class="o">=</span><span class="n">c</span><span class="p">)</span>
        <span class="k">return</span> <span class="n">t</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
    <span class="k">def</span> <span class="fm">__neg__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="n">c</span> <span class="o">=</span> <span class="n">Children</span><span class="p">(</span><span class="n">Tensor</span><span class="p">(</span><span class="n">value</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">zeros_like</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">value</span><span class="p">)),</span> <span class="bp">self</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">subtract</span><span class="p">)</span>
        <span class="n">t</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="n">children</span><span class="o">=</span><span class="n">c</span><span class="p">)</span>
        <span class="k">return</span> <span class="n">t</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
    <span class="k">def</span> <span class="nf">exp</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="n">c</span> <span class="o">=</span> <span class="n">Children</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="kc">None</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">exp</span><span class="p">)</span>
        <span class="n">t</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="n">children</span><span class="o">=</span><span class="n">c</span><span class="p">)</span>
        <span class="k">return</span> <span class="n">t</span><span class="o">.</span><span class="n">forward</span><span class="p">()</span>
</code></pre></div>
<p>For each operation added to the <code>Tensor</code> class, the corresponding derivative has been included in the <code>grad</code> method. We also modified <code>forward</code> to handle more cases, in particular operations that take only one operand, such as the exponential or negation. </p>
<p>Now let's create a more complex formula and differentiate it!<br>
Let's try to differentiate $z$:</p>
<p>$$z = \frac{12 - (x * e^{y})}{45 + x * y * e^{-x}}$$
</p>
<p>All we have to do is write this equation using our <code>Tensor</code> class: </p>
<div class="highlight"><pre><span></span><code><span class="n">x</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">Tensor</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span>
<span class="n">z</span> <span class="o">=</span> <span class="p">(</span><span class="n">Tensor</span><span class="p">(</span><span class="mi">12</span><span class="p">)</span> <span class="o">-</span> <span class="p">(</span><span class="n">x</span> <span class="o">*</span> <span class="n">y</span><span class="o">.</span><span class="n">exp</span><span class="p">()))</span> <span class="o">/</span> <span class="p">(</span><span class="n">Tensor</span><span class="p">(</span><span class="mi">45</span><span class="p">)</span> <span class="o">+</span> <span class="n">x</span> <span class="o">*</span> <span class="n">y</span> <span class="o">*</span> <span class="p">(</span><span class="o">-</span><span class="n">x</span><span class="p">)</span><span class="o">.</span><span class="n">exp</span><span class="p">())</span>
</code></pre></div>
<p>This generates the following computation graph for the tensor <code>z</code>: </p>
<p><img src="imgs/auto-diff/graph_complexe.png" alt="Computation graph of z" width="300" /></p>
<p>We can now easily compute the partial derivatives of <code>z</code> with respect to <code>x</code> and <code>y</code> with the following code: </p>
<div class="highlight"><pre><span></span><code><span class="nb">print</span><span class="p">(</span><span class="n">z</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">x</span><span class="p">))</span> <span class="c1"># T:-3.34729777301069</span>
<span class="nb">print</span><span class="p">(</span><span class="n">z</span><span class="o">.</span><span class="n">grad</span><span class="p">(</span><span class="n">y</span><span class="p">))</span> <span class="c1"># T:-9.70176956641438</span>
</code></pre></div>
<p>This generates the following two graphs: </p>
<p><img src="imgs/auto-diff/graph_complexe_grad_x.png" alt="Computation graph of derivative of z in function of x" /><img src="imgs/auto-diff/graph_complexe_grad_y.png" alt="Computation graph of derivative of z in function of y" /></p>
<p>Finally, to verify that our automatic differentiation system works, we can compare the numerical values of our derivatives with the ones computed by the SymPy library: </p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">sympy</span> <span class="k">as</span> <span class="nn">sym</span>
<span class="n">xs</span> <span class="o">=</span> <span class="n">sym</span><span class="o">.</span><span class="n">Symbol</span><span class="p">(</span><span class="s1">'xs'</span><span class="p">)</span>
<span class="n">ys</span> <span class="o">=</span> <span class="n">sym</span><span class="o">.</span><span class="n">Symbol</span><span class="p">(</span><span class="s1">'ys'</span><span class="p">)</span>
<span class="n">zs</span> <span class="o">=</span> <span class="p">(</span><span class="mi">12</span> <span class="o">-</span> <span class="p">(</span><span class="n">xs</span> <span class="o">*</span> <span class="n">sym</span><span class="o">.</span><span class="n">exp</span><span class="p">(</span><span class="n">ys</span><span class="p">)))</span> <span class="o">/</span> <span class="p">(</span><span class="mi">45</span> <span class="o">+</span> <span class="p">((</span><span class="n">xs</span> <span class="o">*</span> <span class="n">ys</span><span class="p">)</span> <span class="o">*</span> <span class="n">sym</span><span class="o">.</span><span class="n">exp</span><span class="p">(</span><span class="o">-</span><span class="n">xs</span><span class="p">))</span> <span class="p">)</span>
<span class="n">d</span> <span class="o">=</span> <span class="n">zs</span><span class="o">.</span><span class="n">diff</span><span class="p">(</span><span class="n">ys</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">zs</span><span class="o">.</span><span class="n">diff</span><span class="p">(</span><span class="n">xs</span><span class="p">)</span><span class="o">.</span><span class="n">evalf</span><span class="p">(</span><span class="n">subs</span><span class="o">=</span><span class="p">{</span><span class="n">xs</span><span class="p">:</span><span class="mi">3</span><span class="p">,</span> <span class="n">ys</span><span class="p">:</span><span class="mi">5</span><span class="p">}))</span> <span class="c1"># -3.34729777301069</span>
<span class="nb">print</span><span class="p">(</span><span class="n">zs</span><span class="o">.</span><span class="n">diff</span><span class="p">(</span><span class="n">ys</span><span class="p">)</span><span class="o">.</span><span class="n">evalf</span><span class="p">(</span><span class="n">subs</span><span class="o">=</span><span class="p">{</span><span class="n">xs</span><span class="p">:</span><span class="mi">3</span><span class="p">,</span> <span class="n">ys</span><span class="p">:</span><span class="mi">5</span><span class="p">}))</span> <span class="c1"># -9.70176956641438</span>
</code></pre></div>
<p>The result obtained with the SymPy library is exactly the same as with our <code>Tensor</code> class!</p>
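<p>As an extra sanity check, independent of SymPy, we can compare against a central finite-difference approximation, which should agree with the automatic derivatives to several decimal places. A quick sketch with plain NumPy:</p>

```python
import numpy as np

# The same function z(x, y) as above, written with plain NumPy
def f(x, y):
    return (12 - x * np.exp(y)) / (45 + x * y * np.exp(-x))

h = 1e-6
x0, y0 = 3.0, 5.0
# Central differences approximate the partial derivatives at (x0, y0)
dz_dx = (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)
dz_dy = (f(x0, y0 + h) - f(x0, y0 - h)) / (2 * h)
print(dz_dx)  # ≈ -3.3473
print(dz_dy)  # ≈ -9.7018
```

<p>Note that each approximated partial derivative costs two evaluations of <code>f</code> and is only accurate to a few digits, whereas the automatic derivative is exact up to floating-point error.</p>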
<p><br /></p>
<h2>Possible improvements and optimizations</h2>
<p>We have just built the simplest automatic differentiation system there is, and most likely the slowest one too. More complex operations can be added if desired, as long as we know how to differentiate them. As it stands, this class can only handle scalars; for such a library to be truly useful, it would need to <strong>support operations on arrays</strong> of arbitrary sizes. <br>
Also, looking at the graphs, a few possible optimizations stand out: <br>
- In a multiplication node where one of the two <code>children</code> is 0, we should not explore any further, since anything multiplied by 0 is always 0. <br>
- While traversing the tree to differentiate with respect to a tensor <code>x</code>, if the current node does not depend on <code>x</code> and none of its children do either, we could stop the traversal there and treat the current node as a constant. This kind of optimization could greatly speed up the computation for graphs with many nodes and many different variables.
- Looking at the graph, we can see that some operations are repeated. We could <strong>set up a cache</strong> to avoid computing the same things several times. </p>
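<p>The caching idea can be sketched as a small memoisation wrapper around <code>grad</code>. This is a hypothetical illustration: it assumes graph nodes can be keyed by object identity, and the demo uses a stand-in function instead of the full <code>Tensor</code> class so it stays self-contained:</p>

```python
# Hypothetical memoisation layer: cache grad results per (node, variable)
# pair, keyed by object identity since graph nodes are unique objects.
def make_cached_grad(grad_fn):
    cache = {}
    def cached_grad(node, deriv_to):
        key = (id(node), id(deriv_to))
        if key not in cache:
            cache[key] = grad_fn(node, deriv_to)
        return cache[key]
    return cached_grad

# Demo with a stand-in function that counts how often it is really called
calls = []
def expensive_grad(node, deriv_to):
    calls.append((node, deriv_to))
    return "some-derivative-graph"

cached = make_cached_grad(expensive_grad)
a, b = object(), object()
cached(a, b)
cached(a, b)       # second call is served from the cache
print(len(calls))  # 1
```

<p>Applied to our class, this could be as simple as <code>Tensor.grad = make_cached_grad(Tensor.grad)</code>, at the cost of having to invalidate the cache whenever a leaf value changes.</p>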
<p><br /></p>
<p>I hope this article helped you understand how automatic differentiation is performed for the optimization and training of neural networks.<br>
Feel free to share your feedback in the comments.</p>
<h2>Automatic Jacobian matrix computation with SymPy</h2>
<p><em>Victor MARTIN, 2019-01-01</em></p>
<p>In this short article, we will see how we can easily compute the Jacobian matrix of an equation to speed up an optimization problem.
In my job, I had to regularly fit thousands of second derivative Gaussian functions to experimental data, and calculating the <a href="https://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant">Jacobian matrix</a> for gradient descent instead of letting the optimizer approximate it has reduced the calculation time from 30 minutes to 5 minutes.</p>
<p>First let's state that we have noisy data:</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
<span class="kn">from</span> <span class="nn">matplotlib</span> <span class="kn">import</span> <span class="n">pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="n">x</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">9</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="mi">11</span><span class="p">,</span> <span class="mi">12</span><span class="p">,</span> <span class="mi">13</span><span class="p">,</span> <span class="mi">14</span><span class="p">,</span> <span class="mi">15</span><span class="p">,</span> <span class="mi">16</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">18</span><span class="p">,</span> <span class="mi">19</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">21</span><span class="p">,</span> <span class="mi">22</span><span class="p">,</span> <span class="mi">23</span><span class="p">,</span> <span class="mi">24</span><span class="p">,</span> <span class="mi">25</span><span class="p">,</span> <span class="mi">26</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">28</span><span class="p">,</span> <span class="mi">29</span><span class="p">,</span> <span class="mi">30</span><span class="p">,</span> <span class="mi">31</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">33</span><span class="p">,</span> <span class="mi">34</span><span class="p">,</span> <span class="mi">35</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">37</span><span 
class="p">,</span> <span class="mi">38</span><span class="p">,</span> <span class="mi">39</span><span class="p">,</span> <span class="mi">40</span><span class="p">,</span> <span class="mi">41</span><span class="p">,</span> <span class="mi">42</span><span class="p">,</span> <span class="mi">43</span><span class="p">,</span> <span class="mi">44</span><span class="p">,</span> <span class="mi">45</span><span class="p">,</span> <span class="mi">46</span><span class="p">,</span> <span class="mi">47</span><span class="p">,</span> <span class="mi">48</span><span class="p">,</span> <span class="mi">49</span><span class="p">,</span> <span class="mi">50</span><span class="p">,</span> <span class="mi">51</span><span class="p">,</span> <span class="mi">52</span><span class="p">,</span> <span class="mi">53</span><span class="p">,</span> <span class="mi">54</span><span class="p">,</span> <span class="mi">55</span><span class="p">,</span> <span class="mi">56</span><span class="p">,</span> <span class="mi">57</span><span class="p">,</span> <span class="mi">58</span><span class="p">,</span> <span class="mi">59</span><span class="p">,</span> <span class="mi">60</span><span class="p">,</span> <span class="mi">61</span><span class="p">,</span> <span class="mi">62</span><span class="p">,</span> <span class="mi">63</span><span class="p">,</span> <span class="mi">64</span><span class="p">,</span> <span class="mi">65</span><span class="p">,</span> <span class="mi">66</span><span class="p">,</span> <span class="mi">67</span><span class="p">,</span> <span class="mi">68</span><span class="p">,</span> <span class="mi">69</span><span class="p">,</span> <span class="mi">70</span><span class="p">,</span> <span class="mi">71</span><span class="p">,</span> <span class="mi">72</span><span class="p">,</span> <span class="mi">73</span><span class="p">,</span> <span class="mi">74</span><span class="p">,</span> <span class="mi">75</span><span class="p">,</span> <span class="mi">76</span><span 
class="p">,</span> <span class="mi">77</span><span class="p">,</span> <span class="mi">78</span><span class="p">,</span> <span class="mi">79</span><span class="p">,</span> <span class="mi">80</span><span class="p">,</span> <span class="mi">81</span><span class="p">,</span> <span class="mi">82</span><span class="p">,</span> <span class="mi">83</span><span class="p">,</span> <span class="mi">84</span><span class="p">,</span> <span class="mi">85</span><span class="p">,</span> <span class="mi">86</span><span class="p">,</span> <span class="mi">87</span><span class="p">,</span> <span class="mi">88</span><span class="p">,</span> <span class="mi">89</span><span class="p">,</span> <span class="mi">90</span><span class="p">,</span> <span class="mi">91</span><span class="p">,</span> <span class="mi">92</span><span class="p">,</span> <span class="mi">93</span><span class="p">,</span> <span class="mi">94</span><span class="p">,</span> <span class="mi">95</span><span class="p">,</span> <span class="mi">96</span><span class="p">,</span> <span class="mi">97</span><span class="p">,</span> <span class="mi">98</span><span class="p">,</span> <span class="mi">99</span><span class="p">,</span> <span class="mi">100</span><span class="p">,</span> <span class="mi">101</span><span class="p">,</span> <span class="mi">102</span><span class="p">,</span> <span class="mi">103</span><span class="p">,</span> <span class="mi">104</span><span class="p">,</span> <span class="mi">105</span><span class="p">,</span> <span class="mi">106</span><span class="p">,</span> <span class="mi">107</span><span class="p">,</span> <span class="mi">108</span><span class="p">,</span> <span class="mi">109</span><span class="p">,</span> <span class="mi">110</span><span class="p">,</span> <span class="mi">111</span><span class="p">,</span> <span class="mi">112</span><span class="p">,</span> <span class="mi">113</span><span class="p">,</span> <span class="mi">114</span><span class="p">,</span> <span 
class="mi">115</span><span class="p">,</span> <span class="mi">116</span><span class="p">,</span> <span class="mi">117</span><span class="p">,</span> <span class="mi">118</span><span class="p">,</span> <span class="mi">119</span><span class="p">,</span> <span class="mi">120</span><span class="p">,</span> <span class="mi">121</span><span class="p">,</span> <span class="mi">122</span><span class="p">,</span> <span class="mi">123</span><span class="p">,</span> <span class="mi">124</span><span class="p">,</span> <span class="mi">125</span><span class="p">,</span> <span class="mi">126</span><span class="p">,</span> <span class="mi">127</span><span class="p">,</span> <span class="mi">128</span><span class="p">,</span> <span class="mi">129</span><span class="p">,</span> <span class="mi">130</span><span class="p">,</span> <span class="mi">131</span><span class="p">,</span> <span class="mi">132</span><span class="p">,</span> <span class="mi">133</span><span class="p">,</span> <span class="mi">134</span><span class="p">,</span> <span class="mi">135</span><span class="p">,</span> <span class="mi">136</span><span class="p">,</span> <span class="mi">137</span><span class="p">,</span> <span class="mi">138</span><span class="p">,</span> <span class="mi">139</span><span class="p">,</span> <span class="mi">140</span><span class="p">,</span> <span class="mi">141</span><span class="p">,</span> <span class="mi">142</span><span class="p">,</span> <span class="mi">143</span><span class="p">,</span> <span class="mi">144</span><span class="p">,</span> <span class="mi">145</span><span class="p">,</span> <span class="mi">146</span><span class="p">,</span> <span class="mi">147</span><span class="p">,</span> <span class="mi">148</span><span class="p">,</span> <span class="mi">149</span><span class="p">,</span> <span class="mi">150</span><span class="p">,</span> <span class="mi">151</span><span class="p">,</span> <span class="mi">152</span><span class="p">,</span> <span 
class="mi">153</span><span class="p">,</span> <span class="mi">154</span><span class="p">,</span> <span class="mi">155</span><span class="p">,</span> <span class="mi">156</span><span class="p">,</span> <span class="mi">157</span><span class="p">,</span> <span class="mi">158</span><span class="p">,</span> <span class="mi">159</span><span class="p">,</span> <span class="mi">160</span><span class="p">,</span> <span class="mi">161</span><span class="p">,</span> <span class="mi">162</span><span class="p">,</span> <span class="mi">163</span><span class="p">,</span> <span class="mi">164</span><span class="p">,</span> <span class="mi">165</span><span class="p">,</span> <span class="mi">166</span><span class="p">,</span> <span class="mi">167</span><span class="p">,</span> <span class="mi">168</span><span class="p">,</span> <span class="mi">169</span><span class="p">,</span> <span class="mi">170</span><span class="p">,</span> <span class="mi">171</span><span class="p">,</span> <span class="mi">172</span><span class="p">,</span> <span class="mi">173</span><span class="p">,</span> <span class="mi">174</span><span class="p">,</span> <span class="mi">175</span><span class="p">,</span> <span class="mi">176</span><span class="p">,</span> <span class="mi">177</span><span class="p">,</span> <span class="mi">178</span><span class="p">,</span> <span class="mi">179</span><span class="p">,</span> <span class="mi">180</span><span class="p">,</span> <span class="mi">181</span><span class="p">,</span> <span class="mi">182</span><span class="p">,</span> <span class="mi">183</span><span class="p">,</span> <span class="mi">184</span><span class="p">,</span> <span class="mi">185</span><span class="p">,</span> <span class="mi">186</span><span class="p">,</span> <span class="mi">187</span><span class="p">,</span> <span class="mi">188</span><span class="p">,</span> <span class="mi">189</span><span class="p">,</span> <span class="mi">190</span><span class="p">,</span> <span 
class="mi">191</span><span class="p">,</span> <span class="mi">192</span><span class="p">,</span> <span class="mi">193</span><span class="p">,</span> <span class="mi">194</span><span class="p">,</span> <span class="mi">195</span><span class="p">,</span> <span class="mi">196</span><span class="p">,</span> <span class="mi">197</span><span class="p">,</span> <span class="mi">198</span><span class="p">,</span> <span class="mi">199</span><span class="p">,</span> <span class="mi">200</span><span class="p">,</span> <span class="mi">201</span><span class="p">,</span> <span class="mi">202</span><span class="p">,</span> <span class="mi">203</span><span class="p">,</span> <span class="mi">204</span><span class="p">,</span> <span class="mi">205</span><span class="p">,</span> <span class="mi">206</span><span class="p">,</span> <span class="mi">207</span><span class="p">,</span> <span class="mi">208</span><span class="p">,</span> <span class="mi">209</span><span class="p">,</span> <span class="mi">210</span><span class="p">,</span> <span class="mi">211</span><span class="p">,</span> <span class="mi">212</span><span class="p">,</span> <span class="mi">213</span><span class="p">,</span> <span class="mi">214</span><span class="p">,</span> <span class="mi">215</span><span class="p">,</span> <span class="mi">216</span><span class="p">,</span> <span class="mi">217</span><span class="p">,</span> <span class="mi">218</span><span class="p">,</span> <span class="mi">219</span><span class="p">,</span> <span class="mi">220</span><span class="p">,</span> <span class="mi">221</span><span class="p">,</span> <span class="mi">222</span><span class="p">,</span> <span class="mi">223</span><span class="p">,</span> <span class="mi">224</span><span class="p">,</span> <span class="mi">225</span><span class="p">,</span> <span class="mi">226</span><span class="p">,</span> <span class="mi">227</span><span class="p">,</span> <span class="mi">228</span><span class="p">,</span> <span 
class="mi">229</span><span class="p">,</span> <span class="mi">230</span><span class="p">,</span> <span class="mi">231</span><span class="p">,</span> <span class="mi">232</span><span class="p">,</span> <span class="mi">233</span><span class="p">,</span> <span class="mi">234</span><span class="p">,</span> <span class="mi">235</span><span class="p">,</span> <span class="mi">236</span><span class="p">,</span> <span class="mi">237</span><span class="p">,</span> <span class="mi">238</span><span class="p">,</span> <span class="mi">239</span><span class="p">,</span> <span class="mi">240</span><span class="p">,</span> <span class="mi">241</span><span class="p">,</span> <span class="mi">242</span><span class="p">,</span> <span class="mi">243</span><span class="p">,</span> <span class="mi">244</span><span class="p">,</span> <span class="mi">245</span><span class="p">,</span> <span class="mi">246</span><span class="p">,</span> <span class="mi">247</span><span class="p">,</span> <span class="mi">248</span><span class="p">,</span> <span class="mi">249</span><span class="p">,</span> <span class="mi">250</span><span class="p">,</span> <span class="mi">251</span><span class="p">,</span> <span class="mi">252</span><span class="p">,</span> <span class="mi">253</span><span class="p">,</span> <span class="mi">254</span><span class="p">,</span> <span class="mi">255</span><span class="p">,</span> <span class="mi">256</span><span class="p">,</span> <span class="mi">257</span><span class="p">,</span> <span class="mi">258</span><span class="p">,</span> <span class="mi">259</span><span class="p">,</span> <span class="mi">260</span><span class="p">,</span> <span class="mi">261</span><span class="p">,</span> <span class="mi">262</span><span class="p">,</span> <span class="mi">263</span><span class="p">,</span> <span class="mi">264</span><span class="p">,</span> <span class="mi">265</span><span class="p">,</span> <span class="mi">266</span><span class="p">,</span> <span 
class="mi">267</span><span class="p">,</span> <span class="mi">268</span><span class="p">,</span> <span class="mi">269</span><span class="p">,</span> <span class="mi">270</span><span class="p">,</span> <span class="mi">271</span><span class="p">,</span> <span class="mi">272</span><span class="p">,</span> <span class="mi">273</span><span class="p">,</span> <span class="mi">274</span><span class="p">,</span> <span class="mi">275</span><span class="p">,</span> <span class="mi">276</span><span class="p">,</span> <span class="mi">277</span><span class="p">,</span> <span class="mi">278</span><span class="p">,</span> <span class="mi">279</span><span class="p">,</span> <span class="mi">280</span><span class="p">,</span> <span class="mi">281</span><span class="p">,</span> <span class="mi">282</span><span class="p">,</span> <span class="mi">283</span><span class="p">,</span> <span class="mi">284</span><span class="p">,</span> <span class="mi">285</span><span class="p">,</span> <span class="mi">286</span><span class="p">,</span> <span class="mi">287</span><span class="p">,</span> <span class="mi">288</span><span class="p">,</span> <span class="mi">289</span><span class="p">,</span> <span class="mi">290</span><span class="p">,</span> <span class="mi">291</span><span class="p">,</span> <span class="mi">292</span><span class="p">,</span> <span class="mi">293</span><span class="p">,</span> <span class="mi">294</span><span class="p">,</span> <span class="mi">295</span><span class="p">,</span> <span class="mi">296</span><span class="p">,</span> <span class="mi">297</span><span class="p">,</span> <span class="mi">298</span><span class="p">,</span> <span class="mi">299</span><span class="p">]</span>
<span class="n">y</span> <span class="o">=</span> <span class="p">[</span><span class="mf">0.23</span><span class="p">,</span> <span class="mf">0.35</span><span class="p">,</span> <span class="mf">0.01</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.12</span><span class="p">,</span> <span class="mf">0.05</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.02</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.11</span><span class="p">,</span> <span class="mf">0.08</span><span class="p">,</span> <span class="mf">0.03</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.07</span><span class="p">,</span> <span class="mf">0.07</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.11</span><span class="p">,</span> <span class="mf">0.03</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.13</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.09</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.04</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.09</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.07</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.06</span><span class="p">,</span> <span class="mf">0.02</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.22</span><span class="p">,</span> <span class="mf">0.03</span><span class="p">,</span> <span class="mf">0.07</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.1</span><span class="p">,</span> <span class="mf">0.19</span><span class="p">,</span> <span class="mf">0.12</span><span class="p">,</span> <span class="mf">0.03</span><span class="p">,</span> <span 
class="mf">0.04</span><span class="p">,</span> <span class="mf">0.24</span><span class="p">,</span> <span class="mf">0.02</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.0</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.04</span><span class="p">,</span> <span class="mf">0.14</span><span class="p">,</span> <span class="mf">0.01</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.0</span><span class="p">,</span> <span class="mf">0.1</span><span class="p">,</span> <span class="mf">0.19</span><span class="p">,</span> <span class="mf">0.05</span><span class="p">,</span> <span class="mf">0.21</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.01</span><span class="p">,</span> <span class="mf">0.08</span><span class="p">,</span> <span class="mf">0.19</span><span class="p">,</span> <span class="mf">0.18</span><span class="p">,</span> <span class="mf">0.08</span><span class="p">,</span> <span class="mf">0.04</span><span class="p">,</span> <span class="mf">0.06</span><span class="p">,</span> <span class="mf">0.11</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.03</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.01</span><span class="p">,</span> <span class="mf">0.07</span><span class="p">,</span> <span class="mf">0.05</span><span class="p">,</span> <span class="mf">0.05</span><span class="p">,</span> <span class="mf">0.03</span><span class="p">,</span> <span class="mf">0.01</span><span class="p">,</span> <span class="mf">0.24</span><span class="p">,</span> <span class="mf">0.05</span><span class="p">,</span> <span class="mf">0.03</span><span class="p">,</span> <span class="mf">0.06</span><span class="p">,</span> <span class="mf">0.06</span><span class="p">,</span> <span class="mf">0.16</span><span class="p">,</span> <span class="mf">0.18</span><span class="p">,</span> <span class="mf">0.19</span><span class="p">,</span> <span 
class="mf">0.07</span><span class="p">,</span> <span class="mf">0.12</span><span class="p">,</span> <span class="mf">0.17</span><span class="p">,</span> <span class="mf">0.13</span><span class="p">,</span> <span class="mf">0.35</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.03</span><span class="p">,</span> <span class="mf">0.06</span><span class="p">,</span> <span class="mf">0.19</span><span class="p">,</span> <span class="mf">0.32</span><span class="p">,</span> <span class="mf">0.26</span><span class="p">,</span> <span class="mf">0.19</span><span class="p">,</span> <span class="mf">0.04</span><span class="p">,</span> <span class="mf">0.45</span><span class="p">,</span> <span class="mf">0.11</span><span class="p">,</span> <span class="mf">0.36</span><span class="p">,</span> <span class="mf">0.38</span><span class="p">,</span> <span class="mf">0.31</span><span class="p">,</span> <span class="mf">0.23</span><span class="p">,</span> <span class="mf">0.39</span><span class="p">,</span> <span class="mf">0.4</span><span class="p">,</span> <span class="mf">0.25</span><span class="p">,</span> <span class="mf">0.31</span><span class="p">,</span> <span class="mf">0.47</span><span class="p">,</span> <span class="mf">0.49</span><span class="p">,</span> <span class="mf">0.24</span><span class="p">,</span> <span class="mf">0.31</span><span class="p">,</span> <span class="mf">0.34</span><span class="p">,</span> <span class="mf">0.39</span><span class="p">,</span> <span class="mf">0.44</span><span class="p">,</span> <span class="mf">0.46</span><span class="p">,</span> <span class="mf">0.34</span><span class="p">,</span> <span class="mf">0.35</span><span class="p">,</span> <span class="mf">0.48</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">,</span> <span class="mf">0.42</span><span class="p">,</span> <span class="mf">0.4</span><span class="p">,</span> <span class="mf">0.29</span><span class="p">,</span> <span 
class="mf">0.4</span><span class="p">,</span> <span class="mf">0.33</span><span class="p">,</span> <span class="mf">0.37</span><span class="p">,</span> <span class="mf">0.34</span><span class="p">,</span> <span class="mf">0.57</span><span class="p">,</span> <span class="mf">0.4</span><span class="p">,</span> <span class="mf">0.51</span><span class="p">,</span> <span class="mf">0.53</span><span class="p">,</span> <span class="mf">0.39</span><span class="p">,</span> <span class="mf">0.55</span><span class="p">,</span> <span class="mf">0.47</span><span class="p">,</span> <span class="mf">0.55</span><span class="p">,</span> <span class="mf">0.61</span><span class="p">,</span> <span class="mf">0.54</span><span class="p">,</span> <span class="mf">0.44</span><span class="p">,</span> <span class="mf">0.38</span><span class="p">,</span> <span class="mf">0.24</span><span class="p">,</span> <span class="mf">0.41</span><span class="p">,</span> <span class="mf">0.38</span><span class="p">,</span> <span class="mf">0.38</span><span class="p">,</span> <span class="mf">0.49</span><span class="p">,</span> <span class="mf">0.37</span><span class="p">,</span> <span class="mf">0.43</span><span class="p">,</span> <span class="mf">0.37</span><span class="p">,</span> <span class="mf">0.31</span><span class="p">,</span> <span class="mf">0.26</span><span class="p">,</span> <span class="mf">0.48</span><span class="p">,</span> <span class="mf">0.39</span><span class="p">,</span> <span class="mf">0.27</span><span class="p">,</span> <span class="mf">0.46</span><span class="p">,</span> <span class="mf">0.44</span><span class="p">,</span> <span class="mf">0.36</span><span class="p">,</span> <span class="mf">0.27</span><span class="p">,</span> <span class="mf">0.16</span><span class="p">,</span> <span class="mf">0.4</span><span class="p">,</span> <span class="mf">0.21</span><span class="p">,</span> <span class="mf">0.11</span><span class="p">,</span> <span class="mf">0.27</span><span 
class="p">,</span> <span class="mf">0.28</span><span class="p">,</span> <span class="mf">0.13</span><span class="p">,</span> <span class="mf">0.24</span><span class="p">,</span> <span class="mf">0.2</span><span class="p">,</span> <span class="mf">0.06</span><span class="p">,</span> <span class="mf">0.12</span><span class="p">,</span> <span class="mf">0.15</span><span class="p">,</span> <span class="mf">0.07</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.01</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.03</span><span class="p">,</span> <span class="mf">0.11</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.03</span><span class="p">,</span> <span class="mf">0.01</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.02</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.08</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.1</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.2</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.14</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.13</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.19</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.12</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.18</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.26</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.42</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.27</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.37</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.43</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.32</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.36</span><span 
class="p">,</span> <span class="o">-</span><span class="mf">0.45</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.51</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.59</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.45</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.51</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.68</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.59</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.7</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.67</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.64</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.8</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.85</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.88</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.95</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.76</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.8</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.86</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.03</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.07</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.0</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.99</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.94</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.92</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.08</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.12</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.74</span><span 
class="p">,</span> <span class="o">-</span><span class="mf">0.83</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.96</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.79</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.87</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.01</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.94</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.95</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.0</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.03</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.98</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.95</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.07</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.14</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.12</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.02</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.74</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.96</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.85</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.9</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.98</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.04</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.93</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.9</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.75</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.94</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.88</span><span 
class="p">,</span> <span class="o">-</span><span class="mf">0.71</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.63</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.65</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.71</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.78</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.66</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.52</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.63</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.54</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.74</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.53</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.5</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.42</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.43</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.56</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.37</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.34</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.4</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.36</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.32</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.15</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.22</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.18</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.18</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.03</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.11</span><span 
class="p">,</span> <span class="o">-</span><span class="mf">0.13</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.08</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.2</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.14</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.02</span><span class="p">,</span> <span class="mf">0.06</span><span class="p">,</span> <span class="mf">0.14</span><span class="p">,</span> <span class="mf">0.04</span><span class="p">,</span> <span class="mf">0.22</span><span class="p">,</span> <span class="mf">0.15</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.01</span><span class="p">,</span> <span class="mf">0.2</span><span class="p">,</span> <span class="mf">0.25</span><span class="p">,</span> <span class="mf">0.14</span><span class="p">,</span> <span class="mf">0.23</span><span class="p">,</span> <span class="mf">0.23</span><span class="p">,</span> <span class="mf">0.25</span><span class="p">,</span> <span class="mf">0.33</span><span class="p">,</span> <span class="mf">0.26</span><span class="p">,</span> <span class="mf">0.56</span><span class="p">,</span> <span class="mf">0.17</span><span class="p">,</span> <span class="mf">0.38</span><span class="p">,</span> <span class="mf">0.37</span><span class="p">,</span> <span class="mf">0.44</span><span class="p">,</span> <span class="mf">0.27</span><span class="p">,</span> <span class="mf">0.53</span><span class="p">,</span> <span class="mf">0.33</span><span class="p">,</span> <span class="mf">0.27</span><span class="p">,</span> <span class="mf">0.44</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">,</span> <span class="mf">0.4</span><span class="p">,</span> <span class="mf">0.34</span><span class="p">,</span> <span class="mf">0.63</span><span class="p">,</span> <span class="mf">0.32</span><span class="p">,</span> <span class="mf">0.47</span><span 
class="p">,</span> <span class="mf">0.59</span><span class="p">,</span> <span class="mf">0.43</span><span class="p">,</span> <span class="mf">0.34</span><span class="p">,</span> <span class="mf">0.47</span><span class="p">,</span> <span class="mf">0.41</span><span class="p">,</span> <span class="mf">0.3</span><span class="p">,</span> <span class="mf">0.35</span><span class="p">,</span> <span class="mf">0.4</span><span class="p">,</span> <span class="mf">0.48</span><span class="p">,</span> <span class="mf">0.41</span><span class="p">,</span> <span class="mf">0.33</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">,</span> <span class="mf">0.25</span><span class="p">,</span> <span class="mf">0.42</span><span class="p">,</span> <span class="mf">0.3</span><span class="p">,</span> <span class="mf">0.61</span><span class="p">,</span> <span class="mf">0.4</span><span class="p">,</span> <span class="mf">0.48</span><span class="p">]</span>
<span class="n">plt</span><span class="o">.</span><span class="n">scatter</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div>
<p>This gives us:</p>
<p><img alt="Scatter plot of the noisy data" src="imgs/sympy-jacobian/data1.svg"></p>
<p>After that, we fit the second derivative of a Gaussian to this noisy data with least-squares optimization.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">scipy</span> <span class="kn">import</span> <span class="n">optimize</span>
<span class="k">def</span> <span class="nf">second_derivative_gaussian</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span><span class="p">):</span>
<span class="k">return</span> <span class="p">(</span><span class="mi">1</span> <span class="o">/</span> <span class="p">(</span><span class="n">sigma</span><span class="o">**</span><span class="mi">2</span><span class="p">))</span> <span class="o">*</span> <span class="p">(</span><span class="o">-</span><span class="n">sigma</span> <span class="o">**</span> <span class="mi">2</span> <span class="o">+</span> <span class="p">(</span><span class="n">x</span> <span class="o">-</span> <span class="n">mu</span><span class="p">)</span> <span class="o">**</span> <span class="mi">2</span><span class="p">)</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">exp</span><span class="p">(</span><span class="o">-</span><span class="p">(</span><span class="n">x</span> <span class="o">-</span> <span class="n">mu</span><span class="p">)</span> <span class="o">**</span> <span class="mi">2</span> <span class="o">/</span> <span class="p">(</span><span class="mi">2</span> <span class="o">*</span> <span class="n">sigma</span> <span class="o">**</span> <span class="mi">2</span><span class="p">))</span>
<span class="k">def</span> <span class="nf">f</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
<span class="k">return</span> <span class="n">y</span> <span class="o">-</span> <span class="n">second_derivative_gaussian</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="o">*</span><span class="n">p</span><span class="p">)</span>
<span class="n">p0</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="nb">len</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="o">/</span><span class="mi">2</span><span class="p">,</span> <span class="mf">30.</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="nb">float</span><span class="p">)</span>
<span class="n">params_est</span><span class="p">,</span> <span class="n">_</span> <span class="o">=</span> <span class="n">optimize</span><span class="o">.</span><span class="n">leastsq</span><span class="p">(</span>
<span class="n">f</span><span class="p">,</span>
<span class="n">p0</span><span class="p">,</span>
<span class="n">args</span><span class="o">=</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">),</span>
<span class="n">full_output</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">params_est</span><span class="p">)</span>
</code></pre></div>
<p>Starting from an initial guess p0 = [mu0, sigma0], the optimizer iteratively minimizes the function f. At each step, it approximates the partial derivatives of f at the current point [mu, sigma] (or in mathematical terms $\mu$, $\sigma$) using the classic finite-difference formula:
$$\frac{\partial f}{\partial \mu}(\mu, \sigma) = \lim_{h \to 0} \frac{f(\mu + h, \sigma) - f(\mu, \sigma)}{h}$$
In practice, a small fixed $h$ replaces the limit, so approximating the partial derivative with respect to $\mu$ at a specific point $(\mu, \sigma)$ takes two calls to f. Two more calls are needed to approximate the partial derivative with respect to $\sigma$.</p>
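The finite-difference scheme above can be sketched in a few lines of plain Python (a toy illustration with a made-up test function, not scipy's actual implementation; `approx_gradient` is a hypothetical helper):

```python
import numpy as np

def approx_gradient(f, p, h=1e-6):
    """Forward-difference approximation of the gradient of f at point p.

    The call at the base point can be shared between partials, so the
    whole gradient costs len(p) + 1 evaluations of f.
    """
    base = f(np.asarray(p, dtype=float))
    grad = np.zeros(len(p))
    for i in range(len(p)):
        shifted = np.asarray(p, dtype=float)
        shifted[i] += h
        grad[i] = (f(shifted) - base) / h
    return grad

# Toy check: f(mu, sigma) = mu**2 + 3*sigma has gradient (2*mu, 3),
# so at (2, 5) we expect approximately (4, 3).
g = approx_gradient(lambda p: p[0]**2 + 3*p[1], [2.0, 5.0])
```

Each extra parameter adds one more call to f per gradient evaluation, which is exactly the cost the analytic Jacobian below avoids.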
<p>As f is defined analytically, we can compute all of its partial derivatives offline and assemble them into the Jacobian matrix. For that, we are going to use <a href="https://www.sympy.org/">sympy</a> and its symbolic differentiation functions.</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">sympy</span> <span class="k">as</span> <span class="nn">sym</span>
<span class="n">x</span> <span class="o">=</span> <span class="n">sym</span><span class="o">.</span><span class="n">Symbol</span><span class="p">(</span><span class="s1">'x'</span><span class="p">)</span>
<span class="n">mu</span> <span class="o">=</span> <span class="n">sym</span><span class="o">.</span><span class="n">Symbol</span><span class="p">(</span><span class="s1">'mu'</span><span class="p">)</span>
<span class="n">sigma</span> <span class="o">=</span> <span class="n">sym</span><span class="o">.</span><span class="n">Symbol</span><span class="p">(</span><span class="s1">'sigma'</span><span class="p">)</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">sym</span><span class="o">.</span><span class="n">Symbol</span><span class="p">(</span><span class="s1">'y'</span><span class="p">)</span>
<span class="c1"># Same function but we replaced 'np.exp' by 'sym.exp'</span>
<span class="k">def</span> <span class="nf">second_derivative_gaussian_sympy</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span><span class="p">):</span>
<span class="k">return</span> <span class="p">(</span><span class="mi">1</span> <span class="o">/</span> <span class="p">(</span><span class="n">sigma</span><span class="o">**</span><span class="mi">2</span><span class="p">))</span> <span class="o">*</span> <span class="p">(</span><span class="o">-</span><span class="n">sigma</span> <span class="o">**</span> <span class="mi">2</span> <span class="o">+</span> <span class="p">(</span><span class="n">x</span> <span class="o">-</span> <span class="n">mu</span><span class="p">)</span> <span class="o">**</span> <span class="mi">2</span><span class="p">)</span> <span class="o">*</span> <span class="n">sym</span><span class="o">.</span><span class="n">exp</span><span class="p">(</span><span class="o">-</span><span class="p">(</span><span class="n">x</span> <span class="o">-</span> <span class="n">mu</span><span class="p">)</span> <span class="o">**</span> <span class="mi">2</span> <span class="o">/</span> <span class="p">(</span><span class="mi">2</span> <span class="o">*</span> <span class="n">sigma</span> <span class="o">**</span> <span class="mi">2</span><span class="p">))</span>
<span class="n">error</span> <span class="o">=</span> <span class="n">y</span> <span class="o">-</span> <span class="n">second_derivative_gaussian_sympy</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span><span class="p">)</span>
<span class="k">for</span> <span class="n">var</span> <span class="ow">in</span> <span class="p">[</span><span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span><span class="p">]:</span>
<span class="n">error_prime</span> <span class="o">=</span> <span class="n">error</span><span class="o">.</span><span class="n">diff</span><span class="p">(</span><span class="n">var</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">sym</span><span class="o">.</span><span class="n">simplify</span><span class="p">(</span><span class="n">error_prime</span><span class="p">))</span>
</code></pre></div>
<p>In this code, we declare the error function and compute its derivative with respect to each variable (mu and sigma). Finally, we print the reduced results using sympy's simplify function:</p>
<div class="highlight"><pre><span></span><code><span class="c1"># (2*sigma**2*(-mu + x) - (mu - x)*(sigma**2 - (mu - x)**2))*exp(-(mu - x)**2/(2*sigma**2))/sigma**4</span>
<span class="c1"># 3*(mu - x)**2*exp(-(mu - x)**2/(2*sigma**2))/sigma**3 - (mu - x)**4*exp(-(mu - x)**2/(2*sigma**2))/sigma**5</span>
</code></pre></div>
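To double-check that nothing went wrong, the first printed expression can be compared with a numerical derivative at an arbitrary point (a quick sanity check; the point values are chosen only for illustration):

```python
import numpy as np

mu_v, sigma_v, x_v, y_v = 200.0, 50.0, 120.0, 0.0

def error_num(mu, sigma, x, y):
    # Same error function as above, evaluated numerically
    return y - (1 / sigma**2) * (-sigma**2 + (x - mu)**2) * np.exp(-(x - mu)**2 / (2 * sigma**2))

# sympy's simplified d(error)/d(mu), copied from the first printed line
analytic = (2*sigma_v**2*(-mu_v + x_v) - (mu_v - x_v)*(sigma_v**2 - (mu_v - x_v)**2)) \
    * np.exp(-(mu_v - x_v)**2 / (2*sigma_v**2)) / sigma_v**4

# Forward-difference approximation of the same derivative
h = 1e-6
numeric = (error_num(mu_v + h, sigma_v, x_v, y_v) - error_num(mu_v, sigma_v, x_v, y_v)) / h
```

The two values agree to several decimal places, which gives us confidence before wiring the expressions into the optimizer.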
<p>We have now obtained the two analytic partial derivatives of f with respect to mu and sigma. We write a function f_prime that returns them, evaluated with NumPy:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">f_prime</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
<span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span> <span class="o">=</span> <span class="n">p</span>
<span class="k">return</span> <span class="n">np</span><span class="o">.</span><span class="n">vstack</span><span class="p">([</span>
<span class="p">(</span><span class="mi">2</span><span class="o">*</span><span class="n">sigma</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="p">(</span><span class="o">-</span><span class="n">mu</span> <span class="o">+</span> <span class="n">x</span><span class="p">)</span> <span class="o">-</span> <span class="p">(</span><span class="n">mu</span> <span class="o">-</span> <span class="n">x</span><span class="p">)</span><span class="o">*</span><span class="p">(</span><span class="n">sigma</span><span class="o">**</span><span class="mi">2</span> <span class="o">-</span> <span class="p">(</span><span class="n">mu</span> <span class="o">-</span> <span class="n">x</span><span class="p">)</span><span class="o">**</span><span class="mi">2</span><span class="p">))</span><span class="o">*</span><span class="n">np</span><span class="o">.</span><span class="n">exp</span><span class="p">(</span><span class="o">-</span><span class="p">(</span><span class="n">mu</span> <span class="o">-</span> <span class="n">x</span><span class="p">)</span><span class="o">**</span><span class="mi">2</span><span class="o">/</span><span class="p">(</span><span class="mi">2</span><span class="o">*</span><span class="n">sigma</span><span class="o">**</span><span class="mi">2</span><span class="p">))</span><span class="o">/</span><span class="n">sigma</span><span class="o">**</span><span class="mi">4</span><span class="p">,</span>
<span class="mi">3</span><span class="o">*</span><span class="p">(</span><span class="n">mu</span> <span class="o">-</span> <span class="n">x</span><span class="p">)</span><span class="o">**</span><span class="mi">2</span><span class="o">*</span><span class="n">np</span><span class="o">.</span><span class="n">exp</span><span class="p">(</span><span class="o">-</span><span class="p">(</span><span class="n">mu</span> <span class="o">-</span> <span class="n">x</span><span class="p">)</span><span class="o">**</span><span class="mi">2</span><span class="o">/</span><span class="p">(</span><span class="mi">2</span><span class="o">*</span><span class="n">sigma</span><span class="o">**</span><span class="mi">2</span><span class="p">))</span><span class="o">/</span><span class="n">sigma</span><span class="o">**</span><span class="mi">3</span> <span class="o">-</span> <span class="p">(</span><span class="n">mu</span> <span class="o">-</span> <span class="n">x</span><span class="p">)</span><span class="o">**</span><span class="mi">4</span><span class="o">*</span><span class="n">np</span><span class="o">.</span><span class="n">exp</span><span class="p">(</span><span class="o">-</span><span class="p">(</span><span class="n">mu</span> <span class="o">-</span> <span class="n">x</span><span class="p">)</span><span class="o">**</span><span class="mi">2</span><span class="o">/</span><span class="p">(</span><span class="mi">2</span><span class="o">*</span><span class="n">sigma</span><span class="o">**</span><span class="mi">2</span><span class="p">))</span><span class="o">/</span><span class="n">sigma</span><span class="o">**</span><span class="mi">5</span>
<span class="p">])</span><span class="o">.</span><span class="n">T</span>
</code></pre></div>
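Copy-pasting the printed expressions works, but it is error-prone. An alternative sketch (re-declaring the symbols and error expression from the snippets above for completeness) uses sympy's lambdify to generate the NumPy functions automatically:

```python
import numpy as np
import sympy as sym

x, mu, sigma, y = sym.symbols('x mu sigma y')

def second_derivative_gaussian_sympy(x, mu, sigma):
    return (1 / sigma**2) * (-sigma**2 + (x - mu)**2) * sym.exp(-(x - mu)**2 / (2 * sigma**2))

error = y - second_derivative_gaussian_sympy(x, mu, sigma)

# Turn each symbolic partial derivative into a vectorized NumPy function
derivatives = [sym.lambdify((x, y, mu, sigma), error.diff(var), 'numpy')
               for var in (mu, sigma)]

def f_prime(p, x_data, y_data):
    # Jacobian of the residuals, shape (len(x_data), 2), as leastsq expects
    mu_v, sigma_v = p
    return np.vstack([d(x_data, y_data, mu_v, sigma_v) for d in derivatives]).T
```

This version stays in sync with the model automatically if the error expression changes.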
<p>After that, the least-squares optimization uses the function f_prime instead of approximating the gradient at each step:</p>
<div class="highlight"><pre><span></span><code><span class="n">params_est</span><span class="p">,</span> <span class="n">_</span> <span class="o">=</span> <span class="n">optimize</span><span class="o">.</span><span class="n">leastsq</span><span class="p">(</span>
<span class="n">f</span><span class="p">,</span>
<span class="n">p0</span><span class="p">,</span>
<span class="n">Dfun</span><span class="o">=</span><span class="n">f_prime</span><span class="p">,</span> <span class="c1"># The change is here</span>
<span class="n">args</span><span class="o">=</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">),</span>
<span class="n">full_output</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">params_est</span><span class="p">)</span>
<span class="c1"># [200.84118964 50.02549059]</span>
</code></pre></div>
<p>In this optimization, one call to the function f_prime replaces four calls to the function f. At the end of the process we get [mu, sigma] = [200.841, 50.025]. To show the fitting result, the next figure displays the second derivative of a Gaussian centered on 200.841 with a standard deviation of 50.025.</p>
<p><img alt="Second derivative of a Gaussian fitted to the data" src="imgs/sympy-jacobian/data2.svg"></p>
<p>With many parameters, computing the Jacobian matrix can be useful to speed up the optimization.</p>How to make pickle faster?2017-01-05T00:00:00+01:002017-01-05T00:00:00+01:00Victor MARTINtag:vmartin.fr,2017-01-05:/how-to-make-pickle-faster.html<p>When I used the <a href="https://docs.python.org/2/library/pickle.html">pickle module</a> in python in a project to save an object, quickly, as the object began to grow bigger and bigger during the development, loading and saving became a bottleneck in my code.</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">pickle</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
<span class="n">a</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">ones</span><span class="p">((</span><span class="mi">1000</span><span class="p">,</span> <span class="mi">1000</span><span class="p">),</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np …</span></code></pre></div><p>When I used the <a href="https://docs.python.org/2/library/pickle.html">pickle module</a> in python in a project to save an object, quickly, as the object began to grow bigger and bigger during the development, loading and saving became a bottleneck in my code.</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">pickle</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
<span class="n">a</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">ones</span><span class="p">((</span><span class="mi">1000</span><span class="p">,</span> <span class="mi">1000</span><span class="p">),</span> <span class="n">dtype</span><span class="o">=</span><span class="nb">float</span><span class="p">)</span>
<span class="n">file</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'file.dat'</span><span class="p">,</span> <span class="s1">'wb'</span><span class="p">)</span>
<span class="n">pickle</span><span class="o">.</span><span class="n">dump</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">file</span><span class="p">)</span>
<span class="n">file</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
<span class="c1"># Time : 2.5 s to dump</span>
<span class="c1"># Size : 28 MB</span>
<span class="n">file</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'file.dat'</span><span class="p">,</span> <span class="s1">'rb'</span><span class="p">)</span>
<span class="n">b</span> <span class="o">=</span> <span class="n">pickle</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">file</span><span class="p">)</span>
<span class="n">file</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
<span class="c1"># Time : 30 ms to load</span>
</code></pre></div>
<p>I found three little tricks to speed this up a bit:</p>
<ul>
<li><a href="#protocol">1. Use higher protocol version</a></li>
<li><a href="#cpickle">2. Use cPickle instead of pickle</a></li>
<li><a href="#gc">3. Disable the garbage collector</a></li>
</ul>
<h2 id="protocol">1. Use higher protocol version</h2>
<p>By default, pickle uses the lowest protocol for serializing objects and writing them to a file. Files are therefore written in ASCII mode, which is slow and produces voluminous files. To write in binary mode with the highest protocol version available, we just have to set the protocol parameter to -1.</p>
<div class="highlight"><pre><span></span><code><span class="n">file</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'file.dat'</span><span class="p">,</span> <span class="s1">'wb'</span><span class="p">)</span>
<span class="n">pickle</span><span class="o">.</span><span class="n">dump</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">file</span><span class="p">,</span> <span class="n">protocol</span><span class="o">=-</span><span class="mi">1</span><span class="p">)</span>
<span class="n">file</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
<span class="c1"># Time : 24 ms to dump</span>
<span class="c1"># Size : 7.7 MB</span>
<span class="n">file</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'file.dat'</span><span class="p">,</span> <span class="s1">'rb'</span><span class="p">)</span>
<span class="n">b</span> <span class="o">=</span> <span class="n">pickle</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">file</span><span class="p">)</span>
<span class="n">file</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
<span class="c1"># Time : 10 ms to load</span>
</code></pre></div>
<h2 id="cpickle">2. Use cPickle instead of pickle</h2>
<p>A simple tweak to make it a little bit faster is to use cPickle instead of pickle. cPickle is exactly the same module as pickle - same functions, same parameters, same operations - but it is written in C, which gives us that extra bit of speed. (In Python 3, cPickle was merged into the standard pickle module, which uses the C implementation automatically when it is available.)</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">cPickle</span>
<span class="n">file</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'file.dat'</span><span class="p">,</span> <span class="s1">'wb'</span><span class="p">)</span>
<span class="n">cPickle</span><span class="o">.</span><span class="n">dump</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">file</span><span class="p">,</span> <span class="n">protocol</span><span class="o">=-</span><span class="mi">1</span><span class="p">)</span>
<span class="n">file</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
<span class="c1"># Time : 18 ms to dump</span>
<span class="c1"># Size : 7.7 MB</span>
<span class="n">file</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'file.dat'</span><span class="p">,</span> <span class="s1">'rb'</span><span class="p">)</span>
<span class="n">b</span> <span class="o">=</span> <span class="n">cPickle</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">file</span><span class="p">)</span>
<span class="n">file</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
<span class="c1"># Time : 10 ms to load</span>
</code></pre></div>
<p>Here, the speed gain is completely negligible, but in certain cases, such as a deep hierarchy with objects nested in objects, the gain can be substantial.</p>
<h2 id="gc">3. Disable the garbage collector</h2>
<p>Finally, if we have a lot of objects with references to other objects, the garbage collector can slow down the process; disabling it can improve performance.</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">cPickle</span>
<span class="kn">import</span> <span class="nn">gc</span>
<span class="n">gc</span><span class="o">.</span><span class="n">disable</span><span class="p">()</span>
<span class="n">file</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'file.dat'</span><span class="p">,</span> <span class="s1">'wb'</span><span class="p">)</span>
<span class="n">cPickle</span><span class="o">.</span><span class="n">dump</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">file</span><span class="p">,</span> <span class="n">protocol</span><span class="o">=-</span><span class="mi">1</span><span class="p">)</span>
<span class="n">file</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
<span class="c1"># Time : 18 ms to dump</span>
<span class="c1"># Size : 7.7 MB</span>
<span class="n">file</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'file.dat'</span><span class="p">,</span> <span class="s1">'rb'</span><span class="p">)</span>
<span class="n">b</span> <span class="o">=</span> <span class="n">cPickle</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">file</span><span class="p">)</span>
<span class="n">file</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
<span class="n">gc</span><span class="o">.</span><span class="n">enable</span><span class="p">()</span>
</code></pre></div>
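One caveat with this pattern: if dump or load raises, the collector stays disabled. A safer version (a Python 3 sketch using the standard pickle module, since cPickle was merged into it; the helper names are made up for illustration) wraps the calls in try/finally:

```python
import gc
import os
import pickle
import tempfile

def dump_without_gc(obj, path):
    """Pickle obj to path with the garbage collector paused."""
    gc.disable()
    try:
        with open(path, 'wb') as fh:
            pickle.dump(obj, fh, protocol=pickle.HIGHEST_PROTOCOL)
    finally:
        gc.enable()  # re-enabled even if dump() raises

def load_without_gc(path):
    """Load a pickled object with the garbage collector paused."""
    gc.disable()
    try:
        with open(path, 'rb') as fh:
            return pickle.load(fh)
    finally:
        gc.enable()

# Round-trip a nested structure: many cross-referencing objects is
# exactly the case where pausing the collector pays off.
data = {'rows': [list(range(10)) for _ in range(100)]}
path = os.path.join(tempfile.gettempdir(), 'gc_pickle_demo.dat')
dump_without_gc(data, path)
restored = load_without_gc(path)
```

The try/finally guarantees the rest of the program never runs with garbage collection silently turned off.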
<p>Here, we see no improvement because our example is so simple.</p>
<p>If pickle/cPickle is still too slow after that, you can try other libraries like <a href="https://docs.python.org/2/library/marshal.html">marshal</a> or <a href="https://docs.python.org/2/library/json.html">JSON</a>, but you will lose the big advantage of being able to serialize any object with arbitrary structure without effort.</p>