nm1

import%20marimo%0A%0A__generated_with%20%3D%20%220.11.20%22%0Aapp%20%3D%20marimo.App(width%3D%22medium%22)%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_()%3A%0A%20%20%20%20%23%20Relevant%20Imports%0A%0A%20%20%20%20import%20marimo%20as%20mo%0A%20%20%20%20import%20matplotlib.pyplot%20as%20plt%0A%20%20%20%20from%20matplotlib%20import%20animation%0A%20%20%20%20import%20plotly.graph_objects%20as%20go%0A%20%20%20%20import%20numpy%20as%20np%0A%20%20%20%20import%20sympy%20as%20sm%0A%0A%20%20%20%20import%20os%0A%0A%20%20%20%20try%3A%0A%20%20%20%20%20%20%20%20os.chdir(%22assets%2Farticles%2Fnotebooks%22)%0A%20%20%20%20except%3A%0A%20%20%20%20%20%20%20%20pass%0A%0A%0A%20%20%20%20def%20display_iframe(path%3Astr)%3A%0A%20%20%20%20%20%20%20%20%23%20Read%20the%20saved%20Plotly%20HTML%20file%0A%20%20%20%20%20%20%20%20with%20open(path%2C%20%22r%22)%20as%20f%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20html_content%20%3D%20f.read()%0A%0A%20%20%20%20%20%20%20%20%23%20Display%20it%20in%20Jupyter%20Notebook%0A%20%20%20%20%20%20%20%20return%20mo.iframe(html_content%2Cheight%3D'500px')%0A%20%20%20%20return%20animation%2C%20display_iframe%2C%20go%2C%20mo%2C%20np%2C%20os%2C%20plt%2C%20sm%0A%0A%0A%40app.cell%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(%0A%20%20%20%20%20%20%20%20r%22%22%22%0A%20%20%20%20%20%20%20%20%23%20Optimization%2C%20Newton's%20Method%2C%20%26%20Profit%20Maximization%3A%20Part%201%20-%20Basic%20Optimization%20Theory%0A%20%20%20%20%20%20%20%20%3Ccenter%3E%20**Learn%20how%20to%20solve%20and%20utilize%20Newton's%20Method%20for%20multi-dimensional%20optimization%20problems**%20%3C%2Fcenter%3E%0A%20%20%20%20%20%20%20%20%22%22%22%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(%0A%20%20%20%20%20%20%20%20r%22%22%22%0A%20%20%20%20%20%20%20%20%23%23%20Introduction%0A%0A%20%20%20%20%20%20%20%20%3E%20This%20article%20is%20the%20**1st**%20in%20a%203%20part%20series.%20In%20the%201st%20part%2C%20we%20will%20be%20studying%20basic%20optimization%20t
heory.%20Then%2C%20in%20%3Ca%20href%3D%22%2Farticles%2Fnm2%22%20target%3D%22_blank%22%20rel%3D%22noopener%20noreferrer%22%3Ept.%20%202%3C%2Fa%3E%2C%20we%20will%20be%20extending%20this%20theory%20to%20constrained%20optimization%20problems.%20Lastly%2C%20in%20%3Ca%20href%3D%22%2Farticles%2Fnm3%22%20target%3D%22_blank%22%20rel%3D%22noopener%20noreferrer%22%3Ept.%20%203%3C%2Fa%3E%2C%20we%20will%20apply%20the%20optimization%20theory%20covered%20to%20solve%20a%20simple%20profit%20maximization%20problem.%0A%0A%20%20%20%20%20%20%20%20Mathematical%20optimization%20is%20an%20extremely%20powerful%20field%20of%20mathematics%20that%20underpins%20much%20of%20what%20we%2C%20as%20data%20scientists%2C%20implicitly%20or%20explicitly%20utilize%20on%20a%20regular%20basis%20%E2%80%94%20in%20fact%2C%20nearly%20all%20machine%20learning%20algorithms%20make%20use%20of%20optimization%20theory%20to%20obtain%20model%20convergence.%20Take%2C%20for%20example%2C%20a%20classification%20problem%2C%20where%20we%20seek%20to%20minimize%20log-loss%20by%20choosing%20the%20optimal%20parameters%20or%20weights%20of%20the%20model.%20In%20general%2C%20mathematical%20optimization%20can%20be%20thought%20of%20as%20the%20primary%20theoretical%20mechanism%20by%20which%20machines%20learn.%20A%20robust%20understanding%20of%20mathematical%20optimization%20is%20an%20extremely%20beneficial%20skillset%20to%20have%20in%20the%20data%20scientist's%20toolbox%20%E2%80%94%20it%20enables%20the%20data%20scientist%20to%20have%20a%20deeper%20understanding%20of%20many%20of%20the%20algorithms%20used%20today%20and%2C%20furthermore%2C%20to%20solve%20a%20vast%20array%20of%20unique%20optimization%20problems.%0A%0A%20%20%20%20%20%20%20%20Many%20of%20the%20readers%20may%20be%20familiar%20with%20gradient%20descent%2C%20or%20related%20optimization%20algorithms%20such%20as%20stochastic%20gradient%20descent.%20However%2C%20this%20post%20will%20discuss%20in%20more%20depth%20the%20classical%20Newton%20method%20for%20optimization%2C%20sometimes%20
referred%20to%20as%20the%20Newton-Raphson%20method.%20Note%20that%20gradient%20descent%2C%20and%20its%20various%20flavors%2C%20are%20overwhelmingly%20leveraged%20for%20many%20ML%2FAI%20algorithms%20due%20to%20their%20efficiency%20%26%20computational%20tractability.%20We%20will%2C%20nevertheless%2C%20develop%20the%20mathematics%20behind%20optimization%20theory%20from%20the%20basics%20to%20gradient%20descent%20and%20then%20dive%20more%20into%20Newton%E2%80%99s%20method%20with%20implementations%20in%20Python.%20This%20will%20serve%20as%20the%20necessary%20preliminaries%20for%20our%20excursion%20into%20constrained%20optimization%20in%20part%202%20and%20an%20econometric%20profit-maximization%20problem%20in%20part%203%20of%20this%20series.%0A%0A%20%20%20%20%20%20%20%20%23%23%20Optimization%20Basics%20-%20A%20Simple%20Quadratic%20Function%0A%0A%20%20%20%20%20%20%20%20Mathematical%20optimization%20can%20be%20defined%20%E2%80%9Cas%20the%20science%20of%20determining%20the%20best%20solutions%20to%20mathematically%20defined%20problems.%E2%80%9D%5B1%5D%20This%20may%20be%20conceptualized%20in%20some%20real-world%20examples%20as%3A%20choosing%20the%20parameters%20to%20minimize%20a%20loss%20function%20for%20a%20machine%20learning%20algorithm%2C%20choosing%20price%20and%20advertising%20to%20maximize%20profit%2C%20choosing%20stocks%20to%20maximize%20risk-adjusted%20financial%20return%2C%20etc.%20Formally%2C%20any%20mathematical%20optimization%20problem%20can%20be%20formulated%20abstractly%20as%20such%3A%0A%0A%20%20%20%20%20%20%20%20%24%24%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bequation%7D%0A%20%20%20%20%20%20%20%20%5Cbegin%7Baligned%7D%0A%20%20%20%20%20%20%20%20%5Cmin_%7B%5Cmathbf%7Bx%7D%7D%20%5Cquad%26%20f(%5Cmathbf%7Bx%7D)%2C%20%5Cmathbf%7Bx%7D%3D%5Bx_1%2Cx_2%2C%5Cdots%2Cx_n%5D%5ET%20%5Cin%20%5Cmathbb%7BR%7D%5En%20%5C%5C%0A%20%20%20%20%20%20%20%20%5Ctext%7Bsubject%20to%7D%20%5Cquad%20%26%20g_j(%5Cmathbf%7Bx%7D)%20%5Cle%200%2C%20j%3D1%2C2%2C%5Cdots%2Cm%20%5C%5C%0A%20%20%20%20%20%20%
20%20%26%20h_j(%5Cmathbf%7Bx%7D)%20%3D%200%2C%20j%3D1%2C2%2C%5Cdots%2Cr%20%0A%20%20%20%20%20%20%20%20%5Cend%7Baligned%7D%0A%20%20%20%20%20%20%20%20%5Ctag%7B1%7D%0A%20%20%20%20%20%20%20%20%5Cend%7Bequation%7D%0A%20%20%20%20%20%20%20%20%24%24%0A%0A%20%20%20%20%20%20%20%20This%20can%20be%20read%20as%20follows%3A%20Choose%20real%20values%20of%20the%20vector%20%24%5Cmathbf%7Bx%7D%24%20that%20minimize%20the%20objective%20function%20%24f(x)%24%20(or%20maximize%20%24-f(x)%24)%20subject%20to%20the%20inequality%20constraints%20%24g(x)%24%20and%20equality%20constraints%20%24h(x)%24.%20We%20will%20be%20addressing%20how%20to%20solve%20for%20constrained%20optimization%20problems%20in%20part%202%20of%20this%20series%20%E2%80%94%20as%20they%20can%20make%20the%20optimization%20problems%20particularly%20non-trivial.%20For%20now%2C%20let%E2%80%99s%20look%20at%20an%20unconstrained%20single%20variable%20example%20%E2%80%94%20consider%20the%20following%20optimization%20problem%3A%0A%0A%20%20%20%20%20%20%20%20%24%24%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bequation%7D%0A%20%20%20%20%20%20%20%20%5Cmin_x%203x%5E2%2B2x-24%0A%20%20%20%20%20%20%20%20%5Ctag%7B2%7D%0A%20%20%20%20%20%20%20%20%5Cend%7Bequation%7D%0A%20%20%20%20%20%20%20%20%24%24%0A%0A%20%20%20%20%20%20%20%20In%20this%20case%2C%20we%20want%20to%20choose%20the%20value%20of%20%24x%24%20that%20minimizes%20the%20above%20quadratic%20function.%20There%20are%20multiple%20ways%20we%20can%20go%20about%20this%20%E2%80%94%20first%2C%20a%20na%C3%AFve%20approach%20would%20be%20to%20do%20a%20grid%20search%20iterating%20over%20a%20large%20range%20of%20%0A%20%20%20%20%20%20%20%20%24x%24%20values%20and%20choose%20%24x%24%20where%20%24f(x)%24%20has%20the%20lowest%20functional%20value.%20However%2C%20this%20approach%20can%20quickly%20lose%20computational%20tractability%20as%20the%20search%20space%20increases%2C%20the%20function%20becomes%20more%20complex%2C%20or%20the%20dimensions%20increase.%0A%0A%20%20%20%20%20%20%20%20Alternatively%2C%20we%20can%20sol
ve%20directly%20using%20calculus%20if%20a%20closed-form%20solution%20exists.%20That%20is%2C%20we%20can%20solve%20analytically%20for%20the%20value%20of%20%24x%24.%20By%20taking%20the%20derivative%20(or%2C%20as%20covered%20later%2C%20the%20gradient%20in%20higher%20dimensions)%20and%20setting%20it%20equal%20to%200%20%E2%80%94%20the%20first%20order%20necessary%20condition%20for%20a%20relative%20minimum%20%E2%80%94%20we%20can%20solve%20for%20the%20relative%20extrema%20of%20the%20function.%20We%20can%20then%20take%20the%20second%20derivative%20(or%2C%20as%20covered%20later%2C%20the%20Hessian%20in%20higher%20dimensions)%20to%20determine%20whether%20this%20extremum%20is%20a%20maximum%20or%20minimum.%20A%20second%20derivative%20greater%20than%200%20(or%20a%20positive%20definite%20Hessian)%20%E2%80%94%20the%20second%20order%20sufficient%20condition%20for%20a%20relative%20minimum%20%E2%80%94%20implies%20a%20minimum%2C%20while%20a%20second%20derivative%20less%20than%200%20(or%20a%20negative%20definite%20Hessian)%20implies%20a%20maximum.%20Observe%3A%0A%0A%20%20%20%20%20%20%20%20%24%24%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bequation%7D%0A%20%20%20%20%20%20%20%20%5Cbegin%7Baligned%7D%0A%20%20%20%20%20%20%20%20%26%5Cfrac%7Bd%7D%7Bdx%7D(3x%5E2%2B2x-24)%3D0%20%5CRightarrow%206x%2B2%3D0%20%5CRightarrow%20x%5E*%3D-%5Cfrac%7B1%7D%7B3%7D%20%5C%5C%0A%20%20%20%20%20%20%20%20%26%5Cfrac%7Bd%5E2%7D%7Bdx%5E2%7D(3x%5E2%2B2x-24)%3D6%20%3E%200%20%5CRightarrow%20%5Ctext%7Bminimum%7D%0A%20%20%20%20%20%20%20%20%5Cend%7Baligned%7D%0A%20%20%20%20%20%20%20%20%5Ctag%7B3%7D%0A%20%20%20%20%20%20%20%20%5Cend%7Bequation%7D%0A%20%20%20%20%20%20%20%20%24%24%0A%0A%20%20%20%20%20%20%20%20We%20can%20verify%20this%20graphically%20for%20(2)%20above%3A%0A%20%20%20%20%20%20%20%20%22%22%22%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo%2C%20np%2C%20plt)%3A%0A%20%20%20%20def%20parabola_viz()%3A%0A%20%20%20%20%20%20%20%20x%20%3D%20np.linspace(-5.5%2C%205%2C%20100)%0A%20%20%20%20%20%20%20%20y%20%3D%203%20*%20x**2%20%2B%202%20*%20x%20-%2024%0A%0A%20%20%20%20%20%20%20%20%23%20setting%20t
he%20axes%20at%20the%20centre%0A%20%20%20%20%20%20%20%20fig%20%3D%20plt.figure()%0A%20%20%20%20%20%20%20%20ax%20%3D%20fig.add_subplot(1%2C%201%2C%201)%0A%20%20%20%20%20%20%20%20%23%20ax.spines%5B'left'%5D.set_position('zero')%0A%20%20%20%20%20%20%20%20ax.spines%5B%22bottom%22%5D.set_position(%22zero%22)%0A%20%20%20%20%20%20%20%20ax.spines%5B%22right%22%5D.set_color(%22none%22)%0A%20%20%20%20%20%20%20%20ax.spines%5B%22top%22%5D.set_color(%22none%22)%0A%20%20%20%20%20%20%20%20ax.xaxis.set_ticks_position(%22bottom%22)%0A%20%20%20%20%20%20%20%20%23%20ax.yaxis.set_ticks_position('right')%0A%20%20%20%20%20%20%20%20ax.set_yticks(%5B60%2C%2045%2C%2030%2C%2015%2C%200%2C%20-15%5D)%0A%20%20%20%20%20%20%20%20ax.set_xticks(%5B-4%2C%20-2%2C%200%2C%202%2C%204%5D)%0A%20%20%20%20%20%20%20%20ax.text(x%3D-1.75%2C%20y%3D62%2C%20s%3D%22x%3D-1%2F3%22)%0A%20%20%20%20%20%20%20%20ax.axvline(x%3D-1%20%2F%203%2C%20linestyle%3D%22%3A%22%2C%20color%3D%22black%22)%0A%0A%20%20%20%20%20%20%20%20%23%20plot%20the%20function%0A%20%20%20%20%20%20%20%20plt.scatter(-1%20%2F%203%2C%203%20*%20(-1%20%2F%203)%20**%202%20%2B%202%20*%20(-1%20%2F%203)%20-%2024%2C%20c%3D%22black%22)%0A%20%20%20%20%20%20%20%20plt.plot(x%2C%20y%2C%20%22r%22)%0A%0A%20%20%20%20%20%20%20%20plt.savefig(%22data%2Fparabola_viz.webp%22%2C%20format%3D%22webp%22%2C%20dpi%3D300%2C%20bbox_inches%3D'tight')%0A%0A%20%20%20%20parabola_viz()%0A%20%20%20%20mo.image(%22data%2Fparabola_viz.webp%22%2C%20height%3D500).center()%0A%20%20%20%20return%20(parabola_viz%2C)%0A%0A%0A%40app.cell%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(%0A%20%20%20%20%20%20%20%20r%22%22%22%0A%20%20%20%20%20%20%20%20Note%20that%20when%20multiple%20extrema%20of%20a%20function%20exist%20(i.e.%2C%20multiple%20minima%20or%20maxima)%2C%20care%20must%20be%20taken%20to%20determine%20which%20is%20the%20global%20extremum%20%E2%80%94%20we%20will%20briefly%20return%20to%20this%20issue%20later%20in%20this%20article.%0A%0A%20%20%20%20%20%20%20%20The%20analytical%20approach%20demonstrated%20
above%20can%20be%20extended%20into%20higher%20dimensions%20utilizing%20gradients%20and%20Hessians.%20We%20will%20not%20be%20deriving%20closed-form%20solutions%20in%20higher%20dimensions%3B%20the%20intuition%2C%20however%2C%20remains%20the%20same.%20We%20will%2C%20nevertheless%2C%20be%20solving%20higher%20dimensional%20problems%20utilizing%20_iterative%20schemes_.%20What%20do%20I%20mean%20by%20_iterative%20schemes_%3F%20In%20general%2C%20a%20closed%20form%20(or%20analytical)%20solution%20may%20not%20exist%2C%20and%20certainly%20need%20not%20exist%20for%20a%20maximum%20or%20minimum%20to%20exist.%20Thus%2C%20we%20require%20a%20methodology%20to%20numerically%20solve%20the%20optimization%20problem.%20This%20leads%20us%20to%20the%20more%20generalized%20iterative%20schemes%20including%20gradient%20descent%20and%20the%20Newton%20methods.%0A%0A%20%20%20%20%20%20%20%20%23%23%20Iterative%20Optimization%20Schemes%0A%0A%20%20%20%20%20%20%20%20In%20general%2C%20there%20are%20three%20main%20categories%20of%20iterative%20optimization%20schemes%3A%20_zero-order_%2C%20_first-order_%2C%20and%20_second-order_%2C%20which%20make%20use%20of%20local%20information%20about%20the%20function%20from%20no%20derivatives%2C%20first%20derivatives%2C%20or%20second%20derivatives%2C%20respectively.%5B1%5D%20In%20order%20to%20use%20each%20iterative%20scheme%2C%20the%20function%20%24f(x)%24%20must%20be%20a%20continuous%20%26%20differentiable%20function%20to%20the%20respective%20degree.%0A%0A%20%20%20%20%20%20%20%20%23%23%23%20Zero-order%20Iterative%20Schemes%0A%0A%20%20%20%20%20%20%20%20_Zero-order%20iterative%20schemes_%20are%20closely%20aligned%20with%20the%20grid-search%20as%20mentioned%20above%20%E2%80%94%20simply%2C%20you%20search%20over%20a%20certain%20range%20of%20possible%20values%20of%20%24%5Cmathbf%7Bx%7D%24%20to%20obtain%20the%20minimum%20functional%20value.%20As%20you%20likely%20suspect%2C%20these%20methods%20tend%20to%20be%20much%20mo
re%20computationally%20expensive%20than%20methods%20that%20utilize%20higher-order%20information.%20That%20said%2C%20they%20can%20be%20reliable%20and%20easy%20to%20program.%20There%20are%20methodologies%20out%20there%20that%20improve%20upon%20the%20simple%20grid-search%2C%20see%20%5B1%5D%20for%20more%20information%3B%20however%2C%20we%20will%20be%20focusing%20more%20on%20the%20higher-order%20schemes.%0A%0A%20%20%20%20%20%20%20%20%23%23%23%20First-order%20Iterative%20Schemes%0A%0A%20%20%20%20%20%20%20%20_First-order%20iterative%20schemes_%20are%20iterative%20schemes%20that%20utilize%20local%20information%20of%20the%20first%20derivatives%20of%20the%20objective%20function.%20Most%20notably%2C%20gradient%20descent%20methods%20fall%20under%20this%20category.%20For%20a%20single%20variable%20function%20as%20above%2C%20the%20gradient%20is%20just%20the%20first%20derivative.%20Generalizing%20this%20to%20%24n%24%20dimensions%2C%20for%20a%20function%20%24f(x)%24%2C%20the%20gradient%20is%20the%20vector%20of%20first%20order%20partial%20derivatives%3A%0A%0A%20%20%20%20%20%20%20%20%24%24%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bequation%7D%0A%20%20%20%20%20%20%20%20%5Cnabla%20f(%5Cmathbf%7Bx%7D)%3D%20%5Cbegin%7Bbmatrix%7D%0A%20%20%20%20%20%20%20%20%5Cfrac%7B%5Cpartial%20f%7D%7B%5Cpartial%20x_1%7D%20%5C%5C%5B6pt%5D%0A%20%20%20%20%20%20%20%20%5Cfrac%7B%5Cpartial%20f%7D%7B%5Cpartial%20x_2%7D%20%5C%5C%5B6pt%5D%0A%20%20%20%20%20%20%20%20%5Cvdots%20%5C%5C%5B6pt%5D%0A%20%20%20%20%20%20%20%20%5Cfrac%7B%5Cpartial%20f%7D%7B%5Cpartial%20x_n%7D%20%0A%20%20%20%20%20%20%20%20%5Cend%7Bbmatrix%7D%0A%20%20%20%20%20%20%20%20%5Ctag%7B4%7D%0A%20%20%20%20%20%20%20%20%5Cend%7Bequation%7D%0A%20%20%20%20%20%20%20%20%24%24%0A%0A%20%20%20%20%20%20%20%20Gradient%20descent%20begins%20by%20choosing%20a%20random%20starting%20point%20and%20iteratively%20taking%20steps%20in%20the%20direction%20of%20the%20negative%20gradient%20of%20%24f(%5Cmathbf%7Bx%7D)%24%20%E2%80%94%20the%20direction%20of%20steepest%20descent%20of%20the%20function
.%20Each%20iterative%20step%20can%20be%20represented%20as%20follows%3A%0A%0A%20%20%20%20%20%20%20%20%24%24%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bequation%7D%0A%20%20%20%20%20%20%20%20%5Cmathbf%7Bx%7D_%7Bk%2B1%7D%3D%5Cmathbf%7Bx%7D_k-%5Cgamma%20%5Cnabla%20f(%5Cmathbf%7Bx%7D_k)%0A%20%20%20%20%20%20%20%20%5Ctag%7B5%7D%0A%20%20%20%20%20%20%20%20%5Cend%7Bequation%7D%0A%20%20%20%20%20%20%20%20%24%24%0A%0A%20%20%20%20%20%20%20%20where%20%24%5Cgamma%24%20is%20the%20respective%20learning%20rate%2C%20which%20controls%20how%20fast%20or%20slow%20the%20gradient%20descent%20algorithm%20%E2%80%9Clearns%E2%80%9D%20at%20each%20iteration.%20Too%20large%20and%20our%20iterations%20can%20diverge%20uncontrollably.%20Too%20small%20and%20the%20iterations%20can%20take%20forever%20to%20converge.%20This%20scheme%20is%20conducted%20iteratively%20until%20any%20one%20or%20more%20convergence%20criteria%20is%20achieved%2C%20such%20as%3A%0A%0A%20%20%20%20%20%20%20%20%24%24%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bequation%7D%0A%20%20%20%20%20%20%20%20%5Cbegin%7Baligned%7D%0A%20%20%20%20%20%20%20%20%26%5Ctext%7BCriteria%201%3A%20%7D%20%5ClVert%20%5Cmathbf%7Bx%7D_k%20-%20%5Cmathbf%7Bx%7D_%7Bk-1%7D%20%5CrVert%20%3C%20%5Cepsilon_1%20%5C%5C%5B6pt%5D%0A%20%20%20%20%20%20%20%20%26%5Ctext%7BCriteria%202%3A%20%7D%20%5Clvert%20f(%5Cmathbf%7Bx%7D_k)%20-%20f(%5Cmathbf%7Bx%7D_%7Bk-1%7D)%20%5Crvert%20%3C%20%5Cepsilon_2%0A%20%20%20%20%20%20%20%20%5Cend%7Baligned%7D%0A%20%20%20%20%20%20%20%20%5Ctag%7B6%7D%0A%20%20%20%20%20%20%20%20%5Cend%7Bequation%7D%0A%20%20%20%20%20%20%20%20%24%24%0A%0A%20%20%20%20%20%20%20%20for%20some%20small%20epsilon%20threshold.%20Referring%20back%20to%20our%20quadratic%20example%2C%20setting%20our%20initial%20guess%20to%20%24x%20%3D%203%24%20and%20the%20learning%20rate%20%24%5Cgamma%20%3D%200.1%24%2C%20the%20steps%20would%20look%20as%20follows%3A%0A%0A%20%20%20%20%20%20%20%20%24%24%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bequation%7D%0A%20%20%20%20%20%20%20%20%5Cbegin%7Baligned%7D%0A%20%20%20%20%2
0%20%20%20%26%20x_0%20%3D%203%2C%20%5Cquad%20%5Cgamma%20%3D%200.1%2C%20%5Cquad%20%5Cnabla%20f(x)%20%3D%20%5Cfrac%7Bd%7D%7Bdx%7D%20f(x)%20%3D%206x%20%2B%202%20%5C%5C%5B8pt%5D%0A%20%20%20%20%20%20%20%20%26%20x_1%20%3D%20x_0%20-%20%5Cgamma%20%5Cnabla%20f(x_0)%20%3D%203%20-%200.1(6%20%5Ctimes%203%20%2B%202)%20%3D%201%20%5C%5C%5B6pt%5D%0A%20%20%20%20%20%20%20%20%26%20x_2%20%3D%20x_1%20-%20%5Cgamma%20%5Cnabla%20f(x_1)%20%3D%201%20-%200.1(6%20%5Ctimes%201%20%2B%202)%20%3D%200.2%20%5C%5C%5B6pt%5D%0A%20%20%20%20%20%20%20%20%26%20x_3%20%3D%20x_2%20-%20%5Cgamma%20%5Cnabla%20f(x_2)%20%3D%200.2%20-%200.1(6%20%5Ctimes%200.2%20%2B%202)%20%3D%20-0.12%20%5C%5C%5B6pt%5D%0A%20%20%20%20%20%20%20%20%26%20%5Cvdots%20%5C%5C%5B6pt%5D%0A%20%20%20%20%20%20%20%20%26%20x%5E*%20%3D%20x_n%20%5Capprox%20-0.33%20%5Capprox%20-%5Cfrac%7B1%7D%7B3%7D%0A%20%20%20%20%20%20%20%20%5Cend%7Baligned%7D%0A%20%20%20%20%20%20%20%20%5Ctag%7B7%7D%0A%20%20%20%20%20%20%20%20%5Cend%7Bequation%7D%0A%20%20%20%20%20%20%20%20%24%24%0A%0A%20%20%20%20%20%20%20%20And%20visually%2C%20with%20algorithmic%20output%20(functions%20will%20be%20discussed%20later%20in%20article)%3A%0A%20%20%20%20%20%20%20%20%22%22%22%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(animation%2C%20gradient_descent%2C%20mo%2C%20np%2C%20plt%2C%20sm)%3A%0A%20%20%20%20def%20gd_visual()%3A%0A%20%20%20%20%20%20%20%20%23%20Gradient%20Descent%0A%20%20%20%20%20%20%20%20x%20%3D%20sm.symbols(%22x%22)%0A%0A%20%20%20%20%20%20%20%20objective%20%3D%203%20*%20x**2%20%2B%202%20*%20x%20-%2024%0A%20%20%20%20%20%20%20%20symbols%20%3D%20%5Bx%5D%0A%20%20%20%20%20%20%20%20x0%20%3D%20%7Bx%3A%203%7D%0A%0A%20%20%20%20%20%20%20%20_%2C%20x_iterations%20%3D%20gradient_descent(objective%2C%20symbols%2C%20x0%2C%20iterations%3D20)%0A%0A%20%20%20%20%20%20%20%20%23%20Defining%20surface%20and%20axes%0A%20%20%20%20%20%20%20%20x%20%3D%20np.linspace(-5.5%2C%205%2C%20100)%0A%20%20%20%20%20%20%20%20y%20%3D%203%20*%20x**2%20%2B%202%20*%20x%20-%2024%0A%0A
%20%20%20%20%20%20%20%20%23%20setting%20the%20axes%20at%20the%20centre%0A%20%20%20%20%20%20%20%20fig%20%3D%20plt.figure()%0A%20%20%20%20%20%20%20%20ax%20%3D%20fig.add_subplot(1%2C%201%2C%201)%0A%20%20%20%20%20%20%20%20%23%20ax.spines%5B'left'%5D.set_position('zero')%0A%20%20%20%20%20%20%20%20ax.spines%5B%22bottom%22%5D.set_position(%22zero%22)%0A%20%20%20%20%20%20%20%20ax.spines%5B%22right%22%5D.set_color(%22none%22)%0A%20%20%20%20%20%20%20%20ax.spines%5B%22top%22%5D.set_color(%22none%22)%0A%20%20%20%20%20%20%20%20ax.xaxis.set_ticks_position(%22bottom%22)%0A%20%20%20%20%20%20%20%20%23%20ax.yaxis.set_ticks_position('right')%0A%20%20%20%20%20%20%20%20ax.set_yticks(%5B60%2C%2045%2C%2030%2C%2015%2C%200%2C%20-15%5D)%0A%20%20%20%20%20%20%20%20ax.set_xticks(%5B-4%2C%20-2%2C%200%2C%202%2C%204%5D)%0A%20%20%20%20%20%20%20%20ax.text(x%3D-1.75%2C%20y%3D62%2C%20s%3D%22x%3D-1%2F3%22)%0A%20%20%20%20%20%20%20%20ax.text(x%3D2.25%2C%20y%3D13%2C%20s%3D%22Start%22)%0A%20%20%20%20%20%20%20%20ax.axvline(x%3D-1%20%2F%203%2C%20linestyle%3D%22%3A%22%2C%20color%3D%22black%22)%0A%0A%20%20%20%20%20%20%20%20%23%20plot%20the%20function%0A%20%20%20%20%20%20%20%20plt.plot(x%2C%20y%2C%20%22r%22)%0A%0A%20%20%20%20%20%20%20%20x_viz%20%3D%20%5B%5D%0A%20%20%20%20%20%20%20%20y_viz%20%3D%20%5B%5D%0A%0A%20%20%20%20%20%20%20%20def%20animate(iterations)%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20x_viz.append(float(%5Bv%20for%20v%20in%20x_iterations%5Biterations%5D.values()%5D%5B0%5D))%0A%20%20%20%20%20%20%20%20%20%20%20%20y_viz.append(float(objective.evalf(subs%3Dx_iterations%5Biterations%5D)))%0A%20%20%20%20%20%20%20%20%20%20%20%20ax.scatter(x_viz%2C%20y_viz%2C%20c%3D%22black%22)%0A%0A%20%20%20%20%20%20%20%20rot_animation%20%3D%20animation.FuncAnimation(%0A%20%20%20%20%20%20%20%20%20%20%20%20fig%2C%20animate%2C%20frames%3Dlen(x_iterations)%2C%20interval%3D500%0A%20%20%20%20%20%20%20%20)%0A%0A%20%20%20%20%20%20%20%20rot_animation.save(%22data%2Fgradient_descent.gif%22%2C%20dpi%3D300)%0A%0A%20%20%20%20gd_visua
l()%0A%20%20%20%20mo.image(%22data%2Fgradient_descent.gif%22%2C%20height%3D500).center()%0A%20%20%20%20return%20(gd_visual%2C)%0A%0A%0A%40app.cell%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(%0A%20%20%20%20%20%20%20%20r%22%22%22%0A%20%20%20%20%20%20%20%20Gradient%20descent%20and%20first-order%20iterative%20schemes%20are%20notably%20reliable%20in%20their%20performance.%20In%20fact%2C%20gradient%20descent%20algorithms%20are%20primarily%20utilized%20for%20optimization%20of%20loss%20functions%20in%20Neural%20Networks%20and%20ML%20models%2C%20and%20many%20developments%20have%20improved%20the%20efficacy%20of%20these%20algorithms.%20Nevertheless%2C%20they%20are%20still%20using%20limited%20local%20information%20about%20the%20function%20(only%20the%20first%20derivative).%20Thus%2C%20in%20higher%20dimension%20and%20depending%20on%20the%20nature%20of%20the%20objective%20function%20%26%20the%20learning%20rate%2C%20these%20schemes%201)%20can%20have%20a%20slow%20convergence%20rate%20as%20they%20maintain%20a%20linear%20convergence%20rate%20and%202)%20may%20fail%20entirely%20to%20converge.%20Because%20of%20this%2C%20it%20is%20beneficial%20for%20the%20data%20scientist%20to%20expand%20their%20optimization%20arsenal%20for%20more%20complex%20%26%20custom%20optimization%20problems.%0A%0A%20%20%20%20%20%20%20%20%23%23%23%20Second-order%20Iterative%20Schemes%0A%0A%20%20%20%20%20%20%20%20As%20you%20have%20likely%20now%20pieced%20together%2C%20_Second-order%20iterative%20schemes_%20are%20iterative%20schemes%20that%20utilize%20local%20information%20of%20the%20first%20derivatives%20and%20the%20second%20derivatives%20of%20the%20objective%20function.%20Most%20notably%2C%20we%20have%20the%20Newton%20method%20(NM)%2C%20which%20makes%20use%20of%20the%20Hessian%20of%20the%20objective%20function.%20For%20a%20single%20variable%20function%2C%20the%20Hessian%20is%20simply%20the%20second%20derivative.%20Similar%20to%20the%20gradient%2C%20generalizing%20this%20to%20%24n%24%20dimensions%2C%20the%20Hessian%20is%2
0an%20%24n%20%5Ctimes%20n%24%20symmetrical%20matrix%20of%20the%20second%20order%20partial%20derivatives%20of%20a%20twice%20continuously%20differentiable%20function%20%24f(x)%24%3A%0A%0A%20%20%20%20%20%20%20%20%24%24%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bequation%7D%0A%20%20%20%20%20%20%20%20%5Cmathbf%7BH%7D(%5Cmathbf%7Bx%7D)%3D%5Cnabla%5E2%20f(%5Cmathbf%7Bx%7D)%20%3D%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bbmatrix%7D%20%0A%20%20%20%20%20%20%20%20%5Cfrac%7B%5Cpartial%5E2%20f%7D%7B%5Cpartial%20x_1%5E2%7D%20%26%20%5Cfrac%7B%5Cpartial%5E2%20f%7D%7B%5Cpartial%20x_1%20%5Cpartial%20x_2%7D%20%26%20%5Ccdots%20%26%20%5Cfrac%7B%5Cpartial%5E2%20f%7D%7B%5Cpartial%20x_1%20%5Cpartial%20x_n%7D%20%5C%5C%5B8pt%5D%20%0A%20%20%20%20%20%20%20%20%5Cfrac%7B%5Cpartial%5E2%20f%7D%7B%5Cpartial%20x_2%20%5Cpartial%20x_1%7D%20%26%20%5Cfrac%7B%5Cpartial%5E2%20f%7D%7B%5Cpartial%20x_2%5E2%7D%20%26%20%5Ccdots%20%26%20%5Cfrac%7B%5Cpartial%5E2%20f%7D%7B%5Cpartial%20x_2%20%5Cpartial%20x_n%7D%20%5C%5C%5B8pt%5D%20%0A%20%20%20%20%20%20%20%20%5Cvdots%20%26%20%5Cvdots%20%26%20%5Cddots%20%26%20%5Cvdots%20%5C%5C%5B8pt%5D%20%0A%20%20%20%20%20%20%20%20%5Cfrac%7B%5Cpartial%5E2%20f%7D%7B%5Cpartial%20x_n%20%5Cpartial%20x_1%7D%20%26%20%5Cfrac%7B%5Cpartial%5E2%20f%7D%7B%5Cpartial%20x_n%20%5Cpartial%20x_2%7D%20%26%20%5Ccdots%20%26%20%5Cfrac%7B%5Cpartial%5E2%20f%7D%7B%5Cpartial%20x_n%5E2%7D%0A%20%20%20%20%20%20%20%20%5Cend%7Bbmatrix%7D%0A%20%20%20%20%20%20%20%20%5Ctag%7B8%7D%0A%20%20%20%20%20%20%20%20%5Cend%7Bequation%7D%0A%20%20%20%20%20%20%20%20%24%24%0A%0A%20%20%20%20%20%20%20%20Now%20moving%20on%20to%20derive%20the%20NM%2C%20first%20recall%20the%20first%20order%20necessary%20condition%20for%20a%20minimum%3A%0A%0A%20%20%20%20%20%20%20%20%24%24%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bequation%7D%0A%20%20%20%20%20%20%20%20%5Cnabla%20f(%5Cmathbf%7Bx%7D%5E*)%3D0%0A%20%20%20%20%20%20%20%20%5Ctag%7B9%7D%0A%20%20%20%20%20%20%20%20%5Cend%7Bequation%7D%0A%20%20%20%20%20%20%20%20%24%24%0A%0A%20%20%20%20%20%20%20%20Given%20this%2C%
20we%20can%20approximate%20%24%5Cmathbf%7Bx%7D%5E*%24%20using%20a%20Taylor%20Series%20expansion%3A%0A%0A%20%20%20%20%20%20%20%20%24%24%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bequation%7D%0A%20%20%20%20%20%20%20%200%20%3D%20%5Cnabla%20f(%5Cmathbf%7Bx%7D%5E*)%3D%5Cnabla%20f(%5Cmathbf%7Bx%7D_k%20%2B%20%5CDelta)%20%3D%20%5Cnabla%20f(%5Cmathbf%7Bx%7D_k)%20%2B%20%5Cmathbf%7BH%7D(%5Cmathbf%7Bx%7D_k)%5CDelta%5CRightarrow%20%5CDelta%20%3D%20-%5Cmathbf%7BH%7D%5E%7B-1%7D(%5Cmathbf%7Bx%7D_k)%5Cnabla%20f(%5Cmathbf%7Bx%7D_k)%0A%20%20%20%20%20%20%20%20%5Ctag%7B10%7D%0A%20%20%20%20%20%20%20%20%5Cend%7Bequation%7D%0A%20%20%20%20%20%20%20%20%24%24%0A%0A%20%20%20%20%20%20%20%20Each%20iterative%20addition%20of%20%24%5CDelta%24%20is%20an%20expected%20better%20approximation%20of%20%24x%5E*%24.%20Thus%2C%20each%20iterative%20step%20using%20the%20NM%20can%20be%20represented%20as%20follows%3A%0A%0A%20%20%20%20%20%20%20%20%24%24%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bequation%7D%0A%20%20%20%20%20%20%20%20%5Cmathbf%7Bx%7D_%7Bk%2B1%7D%20%3D%20%5Cmathbf%7Bx%7D_k%20-%5Cmathbf%7BH%7D%5E%7B-1%7D(%5Cmathbf%7Bx%7D_k)%5Cnabla%20f(%5Cmathbf%7Bx%7D_k)%0A%20%20%20%20%20%20%20%20%5Ctag%7B11%7D%0A%20%20%20%20%20%20%20%20%5Cend%7Bequation%7D%0A%20%20%20%20%20%20%20%20%24%24%0A%0A%20%20%20%20%20%20%20%20Referring%20back%20to%20our%20quadratic%20example%2C%20setting%20our%20initial%20guess%20to%20%24x%20%3D%203%24%2C%20the%20steps%20would%20look%20as%20follows%3A%0A%0A%20%20%20%20%20%20%20%20%24%24%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bequation%7D%0A%20%20%20%20%20%20%20%20%5Cbegin%7Baligned%7D%0A%20%20%20%20%20%20%20%20%26%20x_0%20%3D%203%2C%20%5Cquad%20%5Cnabla%20f(%5Cmathbf%7Bx%7D)%20%3D%20%5Cfrac%7Bd%7D%7Bdx%7D%20f(x)%20%3D%206x%20%2B%202%2C%20%20%5Cquad%20%5Cmathbf%7BH%7D(%5Cmathbf%7Bx%7D)%20%3D%20%5Cfrac%7Bd%5E2%7D%7Bdx%5E2%7D%20f(x)%20%3D%206%20%5C%5C%5B8pt%5D%0A%20%20%20%20%20%20%20%20%26%20x%5E*%20%3D%20x_1%20%3D%203%20-%20%5Cfrac%7B1%7D%7B6%7D(6%20%5Ctimes%203%20%2B%202)%20%3D%203%20-%20%5Cfrac%7B20%7D%7
B6%7D%20%3D%20-%5Cfrac%7B1%7D%7B3%7D%0A%20%20%20%20%20%20%20%20%5Cend%7Baligned%7D%0A%20%20%20%20%20%20%20%20%5Ctag%7B12%7D%0A%20%20%20%20%20%20%20%20%5Cend%7Bequation%7D%0A%20%20%20%20%20%20%20%20%24%24%0A%0A%20%20%20%20%20%20%20%20And%20we%2C%20elegantly%2C%20converge%20to%20the%20optimal%20solution%20on%20our%20first%20iteration.%20This%20is%20expected%3A%20because%20the%20objective%20is%20quadratic%2C%20the%20linear%20approximation%20of%20the%20gradient%20in%20(10)%20is%20exact.%20Note%20that%20the%20convergence%20criteria%20are%20the%20same%20regardless%20of%20scheme.%0A%0A%20%20%20%20%20%20%20%20%3E%20Note%20that%20all%20of%20the%20optimization%20schemes%20suffer%20from%20the%20possibility%20of%20getting%20caught%20in%20a%20relative%20extremum%2C%20rather%20than%20the%20global%20extremum%20(e.g.%2C%20think%20of%20a%20higher%20order%20polynomial%20with%20multiple%20extrema%20(minima%20and%2For%20maxima)%3A%20we%20could%20get%20stuck%20in%20one%20relative%20extremum%20when%2C%20in%20reality%2C%20another%20extremum%20may%20be%20globally%20better%20for%20our%20problem).%20There%20are%20methods%20developed%2C%20and%20always%20being%20developed%2C%20for%20dealing%20with%20global%20optimization%2C%20which%20we%20will%20not%20dive%20too%20deep%20into.%20You%20can%20use%20prior%20knowledge%20of%20the%20functional%20form%20to%20set%20expectations%20of%20what%20results%20you%20anticipate%20(e.g.%2C%20if%20a%20strictly%20convex%20function%20has%20a%20critical%20point%2C%20then%20it%20must%20be%20a%20global%20minimum).%20**Nevertheless%2C%20as%20a%20general%20rule%20of%20thumb%2C%20it%20is%20always%20wise%20to%20iterate%20optimization%20schemes%20over%20different%20possible%20starting%20values%20of%20x%20and%20then%20study%20the%20stability%20of%20results%2C%20usually%20picking%20the%20results%20with%20the%20best%20functional%20values%20for%20the%20problem%20at%20hand.**%0A%0A%20%20%20%20%20%20%20%20%23%23%20Newton's%20Method%20in%20a%20Multi-Dimensional%20Example%20-%20Rosenbrock's%20Parabolic%20Valley%0A%0A%20%20%20%20%20%20%20%20Let%E2%80%99s%20now%20consider%20the%20following%20optimization%20
problem%20of%20two%20variables%3A%0A%0A%20%20%20%20%20%20%20%20%24%24%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bequation%7D%0A%20%20%20%20%20%20%20%20%5Cmin_%7B%5CGamma%7D%20%3D%20100(y-x%5E2)%5E2%2B(1-x)%5E2%2C%20%5CGamma%3D%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bbmatrix%7D%0A%20%20%20%20%20%20%20%20x%20%5C%5C%20y%0A%20%20%20%20%20%20%20%20%5Cend%7Bbmatrix%7D%0A%20%20%20%20%20%20%20%20%5Cin%20%5Cmathbb%7BR%7D%5E2%0A%20%20%20%20%20%20%20%20%5Ctag%7B13%7D%0A%20%20%20%20%20%20%20%20%5Cend%7Bequation%7D%0A%20%20%20%20%20%20%20%20%24%24%0A%20%20%20%20%20%20%20%20%22%22%22%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(go%2C%20mo%2C%20np)%3A%0A%20%20%20%20def%20rosenbrocks_viz_3d()%3A%0A%20%20%20%20%20%20%20%20x%20%3D%20np.linspace(-4%2C%204%2C%20100)%0A%20%20%20%20%20%20%20%20y%20%3D%20np.linspace(-4%2C%204%2C%20100)%0A%20%20%20%20%20%20%20%20X%2C%20Y%20%3D%20np.meshgrid(x%2C%20y)%0A%20%20%20%20%20%20%20%20Z%20%3D%20100%20*%20(Y%20-%20X**2)%20**%202%20%2B%20(1%20-%20X)%20**%202%0A%0A%20%20%20%20%20%20%20%20fig%20%3D%20go.Figure()%0A%20%20%20%20%20%20%20%20fig.add_trace(go.Surface(%0A%20%20%20%20%20%20%20%20%20%20%20%20z%3DZ%2C%20x%3DX%2C%20y%3DY%2C%20colorscale%3D%22plasma%22%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20colorbar%3Ddict(x%3D-0.15)%20%0A%20%20%20%20%20%20%20%20))%0A%0A%20%20%20%20%20%20%20%20%23%20Add%20the%20optimum%20point%0A%20%20%20%20%20%20%20%20fig.add_trace(%0A%20%20%20%20%20%20%20%20%20%20%20%20go.Scatter3d(%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20x%3D%5B1%5D%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20y%3D%5B1%5D%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20z%3D%5B0%5D%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20mode%3D'markers'%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20marker%3Ddict(size%3D8%2C%20color%3D'green'%2C%20symbol%3D'circle')%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20name%3D'Optimum%20(1%2C1)'%0A%20%20%20%20%20%20%20%20%20%20%20%20)%0A%20%20%20%20%20%2
0%20%20)%0A%0A%20%20%20%20%20%20%20%20fig.update_layout(%0A%20%20%20%20%20%20%20%20%20%20%20%20title%3D%22Rosenbrock's%20Parabolic%20Valley%22%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20scene%3Ddict(%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20xaxis_title%3D%22X%22%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20yaxis_title%3D%22Y%22%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20zaxis_title%3D%22Z%22%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20xaxis%3Ddict(tickvals%3Dlist(range(-4%2C%205)))%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20yaxis%3Ddict(tickvals%3Dlist(range(-4%2C%205)))%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20)%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20showlegend%3DTrue%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20margin%3Ddict(l%3D0%2C%20r%3D0%2C%20b%3D0%2C%20t%3D40)%2C%0A%20%20%20%20%20%20%20%20)%0A%0A%20%20%20%20%20%20%20%20fig.write_image('data%2Frosenbrocks_viz_3d.webp'%2C%20format%3D'webp'%2C%20scale%3D5)%0A%20%20%20%20%20%20%20%20fig.write_html(%22data%2Frosenbrocks_viz_3d.html%22)%0A%0A%20%20%20%20rosenbrocks_viz_3d()%0A%20%20%20%20mo.image(%22data%2Frosenbrocks_viz_3d.webp%22%2C%20height%3D500).center()%0A%20%20%20%20%23%20display_iframe(%22data%2Frosenbrocks_viz_3d.html%22)%0A%20%20%20%20return%20(rosenbrocks_viz_3d%2C)%0A%0A%0A%40app.cell%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%5BView%20Interactive%20Plotly%20Graph%5D(%2Farticles%2Fnotebooks%2Fdata%2Frosenbrocks_viz_3d.html)%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo%2C%20np%2C%20plt)%3A%0A%20%20%20%20def%20rosenbrocks_viz_contour()%3A%0A%20%20%20%20%20%20%20%20%23%20Define%20the%20Rosenbrock%20function%0A%20%20%20%20%20%20%20%20def%20rosenbrock(x%2C%20y)%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20return%20100%20*%20(y%20-%20x**2)%20**%202%20%2B%20(1%20-%20x)%20**%202%0A%0A%20%20%20%20%20%20%20%20%23%20Compute%20gradient%0A%20%20%20%20%20%20%20%20def%20grad_rosenbrock(x%2C%20y)%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20df
_dx%20%3D%20-400%20*%20x%20*%20(y%20-%20x**2)%20-%202%20*%20(1%20-%20x)%0A%20%20%20%20%20%20%20%20%20%20%20%20df_dy%20%3D%20200%20*%20(y%20-%20x**2)%0A%20%20%20%20%20%20%20%20%20%20%20%20return%20df_dx%2C%20df_dy%0A%0A%20%20%20%20%20%20%20%20%23%20Define%20the%20grid%0A%20%20%20%20%20%20%20%20x_vals%20%3D%20np.linspace(-4%2C%204%2C%20100)%0A%20%20%20%20%20%20%20%20y_vals%20%3D%20np.linspace(-4%2C%204%2C%20100)%0A%20%20%20%20%20%20%20%20X%2C%20Y%20%3D%20np.meshgrid(x_vals%2C%20y_vals)%0A%20%20%20%20%20%20%20%20Z%20%3D%20rosenbrock(X%2C%20Y)%0A%0A%20%20%20%20%20%20%20%20%23%20Compute%20gradients%20for%20quiver%20plot%0A%20%20%20%20%20%20%20%20dX%2C%20dY%20%3D%20grad_rosenbrock(X%2C%20Y)%0A%0A%20%20%20%20%20%20%20%20%23%20Plot%20contours%20of%20Rosenbrock%20function%0A%20%20%20%20%20%20%20%20plt.figure(dpi%3D125)%0A%20%20%20%20%20%20%20%20contour%20%3D%20plt.contour(X%2C%20Y%2C%20Z%2C%20levels%3D50%2C%20cmap%3D%22plasma%22)%0A%20%20%20%20%20%20%20%20plt.colorbar(contour)%0A%0A%20%20%20%20%20%20%20%20%23%20Overlay%20gradient%20field%0A%20%20%20%20%20%20%20%20plt.quiver(X%2C%20Y%2C%20dX%2C%20dY%2C%20color%3D%22red%22%2C%20alpha%3D0.6)%0A%0A%20%20%20%20%20%20%20%20%23%20Mark%20the%20optimization%20point%20(theoretical%20minimum%20at%20(1%2C1))%0A%20%20%20%20%20%20%20%20plt.scatter(%0A%20%20%20%20%20%20%20%20%20%20%20%201%2C%201%2C%20color%3D%22green%22%2C%20marker%3D%22o%22%2C%20s%3D100%2C%20label%3D%22Optimum%20(1%2C1)%22%2C%20zorder%3D3%0A%20%20%20%20%20%20%20%20)%0A%0A%20%20%20%20%20%20%20%20%23%20Labels%20and%20legend%0A%20%20%20%20%20%20%20%20plt.xlabel(%22X%22)%0A%20%20%20%20%20%20%20%20plt.ylabel(%22Y%22)%0A%20%20%20%20%20%20%20%20plt.title(%22Contour%20Representation%22)%0A%20%20%20%20%20%20%20%20plt.legend(loc%3D%22lower%20center%22%2C%20bbox_to_anchor%3D(0.5%2C%20-0.25))%0A%20%20%20%20%20%20%20%20plt.savefig(%22data%2Frosenbrocks_viz_contour.webp%22%2C%20format%3D%22webp%22%2C%20dpi%3D300%2C%20bbox_inches%3D'tight')%0A%0A%20%20%20%20rosenbrocks_viz_contour()%0A%
20%20%20%20mo.image(%22data%2Frosenbrocks_viz_contour.webp%22%2C%20height%3D600).center()%0A%20%20%20%20return%20(rosenbrocks_viz_contour%2C)%0A%0A%0A%40app.cell%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(%0A%20%20%20%20%20%20%20%20r%22%22%22%0A%20%20%20%20%20%20%20%20We%20will%20solve%20the%20above%20optimization%20problem%20first%20by%20hand%20and%20then%20in%20Python%2C%20both%20utilizing%20Newton%E2%80%99s%20Method.%0A%0A%20%20%20%20%20%20%20%20%23%23%23%20Solving%20by%20Hand%0A%0A%20%20%20%20%20%20%20%20To%20solve%20by%20hand%2C%20we%20will%20need%20to%20solve%20for%20the%20gradient%2C%20solve%20for%20the%20Hessian%2C%20choose%20our%20initial%20guess%20%24%5CGamma%20%3D%20%5Bx%2C%20y%5D%24%2C%20and%20then%20iteratively%20plug%20this%20information%20into%20the%20NM%20algorithm%20until%20convergence%20is%20achieved.%20First%2C%20solving%20for%20the%20gradient%2C%20we%20have%3A%0A%0A%20%20%20%20%20%20%20%20%24%24%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bequation%7D%0A%20%20%20%20%20%20%20%20%5Cnabla%20f(%5CGamma)%3D%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bbmatrix%7D%20%0A%20%20%20%20%20%20%20%20%5Cfrac%7B%5Cpartial%7Bf%7D%7D%7B%5Cpartial%7Bx%7D%7D(%5CGamma)%20%5C%5C%5B6pt%5D%0A%20%20%20%20%20%20%20%20%5Cfrac%7B%5Cpartial%7Bf%7D%7D%7B%5Cpartial%7By%7D%7D(%5CGamma)%20%5C%5C%0A%20%20%20%20%20%20%20%20%5Cend%7Bbmatrix%7D%20%3D%20%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bbmatrix%7D%20%0A%20%20%20%20%20%20%20%20200(y-x%5E2)(-2x)-2(1-x)%20%5C%5C%5B6pt%5D%0A%20%20%20%20%20%20%20%20200(y-x%5E2)%20%5C%5C%0A%20%20%20%20%20%20%20%20%5Cend%7Bbmatrix%7D%0A%20%20%20%20%20%20%20%20%5Ctag%7B14%7D%0A%20%20%20%20%20%20%20%20%5Cend%7Bequation%7D%0A%20%20%20%20%20%20%20%20%24%24%0A%0A%20%20%20%20%20%20%20%20Solving%20for%20the%20Hessian%2C%20we%20have%3A%0A%0A%20%20%20%20%20%20%20%20%24%24%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bequation%7D%0A%20%20%20%20%20%20%20%20%5Cmathbf%7BH%7D(%5CGamma)%3D%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bbmatrix%7D%20%0A%20%20%20%20%20%20%20%20%5Cfrac%7B%5Cpartial%5E2%
7Bf%7D%7D%7B%5Cpartial%7Bx%5E2%7D%7D(%5CGamma)%20%26%20%5Cfrac%7B%5Cpartial%5E2%7Bf%7D%7D%7B%5Cpartial%7Bx%7D%5Cpartial%7By%7D%7D(%5CGamma)%5C%5C%5B6pt%5D%0A%20%20%20%20%20%20%20%20%5Cfrac%7B%5Cpartial%5E2%7Bf%7D%7D%7B%5Cpartial%7By%7D%5Cpartial%7Bx%7D%7D(%5CGamma)%20%26%20%5Cfrac%7B%5Cpartial%5E2%7Bf%7D%7D%7B%5Cpartial%7By%5E2%7D%7D(%5CGamma)%5C%5C%0A%20%20%20%20%20%20%20%20%5Cend%7Bbmatrix%7D%20%3D%20%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bbmatrix%7D%20%0A%20%20%20%20%20%20%20%20-400y%2B1200x%5E2%2B2%20%26%20-400x%20%5C%5C%5B6pt%5D%0A%20%20%20%20%20%20%20%20-400x%20%26%20200%20%5C%5C%0A%20%20%20%20%20%20%20%20%5Cend%7Bbmatrix%7D%0A%20%20%20%20%20%20%20%20%5Ctag%7B15%7D%0A%20%20%20%20%20%20%20%20%5Cend%7Bequation%7D%0A%20%20%20%20%20%20%20%20%24%24%0A%0A%20%20%20%20%20%20%20%20Setting%20our%20initial%20guess%20to%20%24%5CGamma%20%3D%20%5B-1.2%2C1%5D%24%2C%20we%20have%3A%0A%0A%20%20%20%20%20%20%20%20%24%24%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bequation%7D%0A%20%20%20%20%20%20%20%20%5Cbegin%7Baligned%7D%0A%20%20%20%20%20%20%20%20%5CGamma_0%20%26%3D%20%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bbmatrix%7D%20-1.2%20%5C%5C%201%20%5Cend%7Bbmatrix%7D%20%5C%5C%5B16pt%5D%0A%20%20%20%20%20%20%20%20%5CGamma_1%20%26%3D%20%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bbmatrix%7D%20-1.2%20%5C%5C%201%20%5Cend%7Bbmatrix%7D%20%0A%20%20%20%20%20%20%20%20-%20%5Cbegin%7Bbmatrix%7D%201330%20%26%20480%20%5C%5C%20480%20%26%20200%20%5Cend%7Bbmatrix%7D%5E%7B-1%7D%20%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bbmatrix%7D%20-215.6%20%5C%5C%20-88%20%5Cend%7Bbmatrix%7D%20%0A%20%20%20%20%20%20%20%20%26%3D%20%5Cbegin%7Bbmatrix%7D%20-1.175%20%5C%5C%201.381%20%5Cend%7Bbmatrix%7D%20%5C%5C%5B10pt%5D%0A%20%20%20%20%20%20%20%20%5CGamma_2%20%26%3D%20%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bbmatrix%7D%20-1.175%20%5C%5C%201.381%20%5Cend%7Bbmatrix%7D%20%0A%20%20%20%20%20%20%20%20-%20%5Cbegin%7Bbmatrix%7D%201107.27%20%26%20470.11%20%5C%5C%20470.11%20%26%20200%20%5Cend%7Bbmatrix%7D%5E%7B-1%7D%20%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bbm
atrix%7D%20-4.634%20%5C%5C%20-0.122%20%5Cend%7Bbmatrix%7D%20%0A%20%20%20%20%20%20%20%20%26%3D%20%5Cbegin%7Bbmatrix%7D%200.763%20%5C%5C%20-3.175%20%5Cend%7Bbmatrix%7D%20%5C%5C%5B10pt%5D%0A%20%20%20%20%20%20%20%20%5CGamma_3%20%26%3D%20%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bbmatrix%7D%200.763%20%5C%5C%20-3.175%20%5Cend%7Bbmatrix%7D%20%0A%20%20%20%20%20%20%20%20-%20%5Cbegin%7Bbmatrix%7D%201970.83%20%26%20-305.25%20%5C%5C%20-305.25%20%26%20200%20%5Cend%7Bbmatrix%7D%5E%7B-1%7D%20%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bbmatrix%7D%201146.45%20%5C%5C%20-751.48%20%5Cend%7Bbmatrix%7D%20%0A%20%20%20%20%20%20%20%20%26%3D%20%5Cbegin%7Bbmatrix%7D%200.763%20%5C%5C%200.583%20%5Cend%7Bbmatrix%7D%20%5C%5C%5B10pt%5D%0A%20%20%20%20%20%20%20%20%5CGamma_4%20%26%3D%20%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bbmatrix%7D%200.763%20%5C%5C%200.583%20%5Cend%7Bbmatrix%7D%20%0A%20%20%20%20%20%20%20%20-%20%5Cbegin%7Bbmatrix%7D%20468.26%20%26%20-305.37%20%5C%5C%20-305.37%20%26%20200%20%5Cend%7Bbmatrix%7D%5E%7B-1%7D%20%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bbmatrix%7D%20-0.473%20%5C%5C%20-0.00002%20%5Cend%7Bbmatrix%7D%20%0A%20%20%20%20%20%20%20%20%26%3D%20%5Cbegin%7Bbmatrix%7D%200.999%20%5C%5C%200.944%20%5Cend%7Bbmatrix%7D%20%5C%5C%5B10pt%5D%0A%20%20%20%20%20%20%20%20%5CGamma_5%20%26%3D%20%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bbmatrix%7D%200.999%20%5C%5C%200.944%20%5Cend%7Bbmatrix%7D%20%0A%20%20%20%20%20%20%20%20-%20%5Cbegin%7Bbmatrix%7D%20824.38%20%26%20-400%20%5C%5C%20-400%20%26%20200%20%5Cend%7Bbmatrix%7D%5E%7B-1%7D%20%0A%20%20%20%20%20%20%20%20%5Cbegin%7Bbmatrix%7D%2022.39%20%5C%5C%20-11.19%20%5Cend%7Bbmatrix%7D%20%0A%20%20%20%20%20%20%20%20%26%3D%20%5Cbegin%7Bbmatrix%7D%200.999%20%5C%5C%200.999%20%5Cend%7Bbmatrix%7D%20%5C%5C%5B16pt%5D%0A%20%20%20%20%20%20%20%20%5CGamma_5%20%26%5Capprox%20%5CGamma%5E*%20%3D%20%5Cbegin%7Bbmatrix%7D%20x%5E*%20%5C%5C%20y%5E*%20%5Cend%7Bbmatrix%7D%20%3D%20%5Cbegin%7Bbmatrix%7D%201%20%5C%5C%201%20%5Cend%7Bbmatrix%7D%0A%20%20%20%20%20%20%20%20%5Cend%7Baligned%7D%0A%20%20%20%20%
20%20%20%20%5Ctag%7B16%7D%0A%20%20%20%20%20%20%20%20%5Cend%7Bequation%7D%0A%20%20%20%20%20%20%20%20%24%24%0A%0A%20%20%20%20%20%20%20%20Thus%2C%20we%20arrive%20at%20the%20minimum%20of%20our%20objective%20function%20at%20%24%5CGamma%5E*%20%3D%20%5B1%2C1%5D%24.%0A%0A%20%20%20%20%20%20%20%20%23%23%23%20Solving%20in%20Python%20using%20SymPy%0A%0A%20%20%20%20%20%20%20%20%3E%20Note%20that%20this%20implementation%20is%20meant%20for%20demonstration%2C%20not%20efficiency.%20There%20are%20many%20optimization%20frameworks%20%26%20tools%20that%20are%20optimized%20heavily%20for%20efficient%20implementations%2C%20such%20as%20%5BSciPy%5D(https%3A%2F%2Fscipy.org%2F)%20%26%20%5BPyomo%5D(https%3A%2F%2Fwww.pyomo.org%2F).%0A%0A%0A%20%20%20%20%20%20%20%20We%20will%20now%20turn%20to%20solving%20this%20problem%2C%20and%20generalizing%20it%20to%20any%20function%2C%20in%20Python%20using%20%5BSymPy%5D(https%3A%2F%2Fwww.sympy.org%2Fen%2Findex.html)%2C%20a%20Python%20library%20for%20symbolic%20mathematics.%20First%2C%20let%E2%80%99s%20walk%20through%20defining%20Rosenbrock%E2%80%99s%20parabolic%20valley%20and%20calculating%20the%20gradient%20%26%20Hessian%20of%20the%20function%3A%0A%20%20%20%20%20%20%20%20%22%22%22%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(np%2C%20sm)%3A%0A%20%20%20%20%23%20Define%20symbols%20%26%20objective%20function%20(Rosenbrock's%20Parabolic%20Valley)%0A%20%20%20%20x%2C%20y%20%3D%20sm.symbols(%22x%20y%22)%0A%20%20%20%20Gamma%20%3D%20%5Bx%2C%20y%5D%0A%20%20%20%20objective%20%3D%20100%20*%20(y%20-%20x**2)%20**%202%20%2B%20(1%20-%20x)%20**%202%0A%0A%20%20%20%20def%20get_gradient_sym(%0A%20%20%20%20%20%20%20%20function%3A%20sm.Expr%2C%0A%20%20%20%20%20%20%20%20symbols%3A%20list%5Bsm.Symbol%5D%2C%0A%20%20%20%20)%20-%3E%20np.ndarray%3A%0A%20%20%20%20%20%20%20%20%22%22%22%0A%20%20%20%20%20%20%20%20Calculate%20the%20gradient%20of%20a%20function.%0A%0A%20%20%20%20%20%20%20%20Args%3A%0A%2
0%20%20%20%20%20%20%20%20%20%20%20function%20(sm.Expr)%3A%20The%20function%20to%20calculate%20the%20gradient%20of.%0A%20%20%20%20%20%20%20%20%20%20%20%20symbols%20(list%5Bsm.Symbol%5D)%3A%20The%20symbols%20representing%20the%20variables%20in%20the%20function.%0A%0A%20%20%20%20%20%20%20%20Returns%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20numpy.ndarray%3A%20The%20gradient%20of%20the%20function.%0A%20%20%20%20%20%20%20%20%22%22%22%0A%20%20%20%20%20%20%20%20d1%20%3D%20%7B%7D%0A%20%20%20%20%20%20%20%20gradient%20%3D%20np.array(%5B%5D)%0A%0A%20%20%20%20%20%20%20%20for%20i%20in%20symbols%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20d1%5Bi%5D%20%3D%20sm.diff(function%2C%20i%2C%201)%0A%20%20%20%20%20%20%20%20%20%20%20%20gradient%20%3D%20np.append(gradient%2C%20d1%5Bi%5D)%0A%0A%20%20%20%20%20%20%20%20return%20gradient%0A%0A%20%20%20%20def%20get_hessian_sym(%0A%20%20%20%20%20%20%20%20function%3A%20sm.Expr%2C%0A%20%20%20%20%20%20%20%20symbols%3A%20list%5Bsm.Symbol%5D%2C%0A%20%20%20%20)%20-%3E%20np.ndarray%3A%0A%20%20%20%20%20%20%20%20%22%22%22%0A%20%20%20%20%20%20%20%20Calculate%20the%20Hessian%20matrix%20of%20a%20function.%0A%0A%20%20%20%20%20%20%20%20Args%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20function%20(sm.Expr)%3A%20The%20function%20for%20which%20the%20Hessian%20matrix%20is%20calculated.%0A%20%20%20%20%20%20%20%20%20%20%20%20symbols%20(list%5Bsm.Symbol%5D)%3A%20The%20list%20of%20symbols%20used%20in%20the%20function.%0A%0A%20%20%20%20%20%20%20%20Returns%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20numpy.ndarray%3A%20The%20Hessian%20matrix%20of%20the%20function.%0A%20%20%20%20%20%20%20%20%22%22%22%0A%20%20%20%20%20%20%20%20d2%20%3D%20%7B%7D%0A%20%20%20%20%20%20%20%20hessian%20%3D%20np.array(%5B%5D)%0A%0A%20%20%20%20%20%20%20%20for%20i%20in%20symbols%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20for%20j%20in%20symbols%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20d2%5Bf%22%7Bi%7D%7Bj%7D%22%5D%20%3D%20sm.diff(function%2C%20i%2C%20j)%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%2
0hessian%20%3D%20np.append(hessian%2C%20d2%5Bf%22%7Bi%7D%7Bj%7D%22%5D)%0A%0A%20%20%20%20%20%20%20%20hessian%20%3D%20np.array(np.array_split(hessian%2C%20len(symbols)))%0A%0A%20%20%20%20%20%20%20%20return%20hessian%0A%20%20%20%20return%20Gamma%2C%20get_gradient_sym%2C%20get_hessian_sym%2C%20objective%2C%20x%2C%20y%0A%0A%0A%40app.cell%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22SymPy%20allows%20us%20to%20investigate%20the%20symbolic%20representation%20of%20our%20equations.%20For%20example%2C%20if%20we%20call%20%60objective%60%2C%20we%20will%20see%20the%20corresponding%20output%3A%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(objective)%3A%0A%20%20%20%20objective%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22Additionally%2C%20SymPy%20allows%20us%20to%20take%20the%20derivatives%20of%20the%20function%20utilizing%20the%20%60sm.diff()%60%20command.%20If%20we%20run%20our%20defined%20function%20to%20obtain%20the%20gradient%2C%20%60get_gradient_sym(objective%2C%20Gamma)%60%2C%20we%20obtain%20a%20NumPy%20array%20representing%20the%20gradient%3A%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(Gamma%2C%20get_gradient_sym%2C%20objective)%3A%0A%20%20%20%20get_gradient_sym(objective%2C%20Gamma)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22Accessing%20a%20specific%20element%2C%20we%20can%20see%20the%20symbolic%20representation%20%60get_gradient_sym(objective%2C%20Gamma)%5B0%5D%60%3A%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(Gamma%2C%20get_gradient_sym%2C%20objective)%3A%0A%20%20%20%20get_gradient_sym(objective%2C%20Gamma)%5B0%5D%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22Similarly%2C%20for%20the%20Hessian%20we%20can%20call%20%60get_hessian_sym(objective%2C%20Gamma)%60%3A%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(Gamma%2C%20get_hessian_sym%2C%20objective)%3A%0A%20%20%20%20get_
hessian_sym(objective%2C%20Gamma)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22Accessing%20a%20specific%20element%20%60get_hessian_sym(objective%2CGamma)%5B0%5D%5B1%5D%60%3A%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(Gamma%2C%20get_hessian_sym%2C%20objective)%3A%0A%20%20%20%20get_hessian_sym(objective%2C%20Gamma)%5B0%5D%5B1%5D%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22One%20can%20easily%20verify%20that%20the%20gradient%20and%20Hessian%20are%20identical%20to%20the%20ones%20we%20solved%20out%20by%20hand.%20SymPy%20allows%20for%20the%20evaluation%20of%20any%20function%20given%20specified%20values%20for%20the%20symbols.%20For%20example%2C%20we%20can%20evaluate%20the%20gradient%20at%20our%20initial%20guess%20by%20tweaking%20the%20function%20as%20follows%3A%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(np%2C%20sm)%3A%0A%20%20%20%20def%20get_gradient(%0A%20%20%20%20%20%20%20%20function%3A%20sm.Expr%2C%0A%20%20%20%20%20%20%20%20symbols%3A%20list%5Bsm.Symbol%5D%2C%0A%20%20%20%20%20%20%20%20x0%3A%20dict%5Bsm.Symbol%2C%20float%5D%2C%20%20%23%20Add%20x0%20as%20argument%0A%20%20%20%20)%20-%3E%20np.ndarray%3A%0A%20%20%20%20%20%20%20%20%22%22%22%0A%20%20%20%20%20%20%20%20Calculate%20the%20gradient%20of%20a%20function%20at%20a%20given%20point.%0A%0A%20%20%20%20%20%20%20%20Args%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20function%20(sm.Expr)%3A%20The%20function%20to%20calculate%20the%20gradient%20of.%0A%20%20%20%20%20%20%20%20%20%20%20%20symbols%20(list%5Bsm.Symbol%5D)%3A%20The%20symbols%20representing%20the%20variables%20in%20the%20function.%0A%20%20%20%20%20%20%20%20%20%20%20%20x0%20(dict%5Bsm.Symbol%2C%20float%5D)%3A%20The%20point%20at%20which%20to%20calculate%20the%20gradient.%0A%0A%20%20%20%20%20%20%20%20Returns%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20numpy.ndarray%3A%20The%20gradient%20of%20the%20function%20at%20the%20given%20point.%0A%20%20%20%20%20%20%20%20
%22%22%22%0A%20%20%20%20%20%20%20%20d1%20%3D%20%7B%7D%0A%20%20%20%20%20%20%20%20gradient%20%3D%20np.array(%5B%5D)%0A%0A%20%20%20%20%20%20%20%20for%20i%20in%20symbols%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20d1%5Bi%5D%20%3D%20sm.diff(function%2C%20i%2C%201).evalf(subs%3Dx0)%20%20%23%20add%20evalf%20method%0A%20%20%20%20%20%20%20%20%20%20%20%20gradient%20%3D%20np.append(gradient%2C%20d1%5Bi%5D)%0A%0A%20%20%20%20%20%20%20%20return%20gradient.astype(np.float64)%20%20%23%20Change%20data%20type%20to%20float%0A%0A%20%20%20%20def%20get_hessian(%0A%20%20%20%20%20%20%20%20function%3A%20sm.Expr%2C%0A%20%20%20%20%20%20%20%20symbols%3A%20list%5Bsm.Symbol%5D%2C%0A%20%20%20%20%20%20%20%20x0%3A%20dict%5Bsm.Symbol%2C%20float%5D%2C%0A%20%20%20%20)%20-%3E%20np.ndarray%3A%0A%20%20%20%20%20%20%20%20%22%22%22%0A%20%20%20%20%20%20%20%20Calculate%20the%20Hessian%20matrix%20of%20a%20function%20at%20a%20given%20point.%0A%0A%20%20%20%20%20%20%20%20Args%3A%0A%20%20%20%20%20%20%20%20function%20(sm.Expr)%3A%20The%20function%20for%20which%20the%20Hessian%20matrix%20is%20calculated.%0A%20%20%20%20%20%20%20%20symbols%20(list%5Bsm.Symbol%5D)%3A%20The%20list%20of%20symbols%20used%20in%20the%20function.%0A%20%20%20%20%20%20%20%20x0%20(dict%5Bsm.Symbol%2C%20float%5D)%3A%20The%20point%20at%20which%20the%20Hessian%20matrix%20is%20evaluated.%0A%0A%20%20%20%20%20%20%20%20Returns%3A%0A%20%20%20%20%20%20%20%20numpy.ndarray%3A%20The%20Hessian%20matrix%20of%20the%20function%20at%20the%20given%20point.%0A%20%20%20%20%20%20%20%20%22%22%22%0A%20%20%20%20%20%20%20%20d2%20%3D%20%7B%7D%0A%20%20%20%20%20%20%20%20hessian%20%3D%20np.array(%5B%5D)%0A%0A%20%20%20%20%20%20%20%20for%20i%20in%20symbols%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20for%20j%20in%20symbols%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20d2%5Bf%22%7Bi%7D%7Bj%7D%22%5D%20%3D%20sm.diff(function%2C%20i%2C%20j).evalf(subs%3Dx0)%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20hessian%20%3D%20np.append(hessian%2C%20d2%5Bf%22%7Bi%7D%7Bj%7D%22%5D)%0A%
0A%20%20%20%20%20%20%20%20hessian%20%3D%20np.array(np.array_split(hessian%2C%20len(symbols)))%0A%0A%20%20%20%20%20%20%20%20return%20hessian.astype(np.float64)%0A%20%20%20%20return%20get_gradient%2C%20get_hessian%0A%0A%0A%40app.cell%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22We%20can%20now%20compute%20our%20gradient%20given%20our%20starting%20point%20by%20calling%20%60get_gradient(objective%2C%20Gamma%2C%20%7Bx%3A-1.2%2Cy%3A1.0%7D)%60%3A%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(Gamma%2C%20get_gradient%2C%20objective%2C%20x%2C%20y)%3A%0A%20%20%20%20get_gradient(objective%2C%20Gamma%2C%20%7Bx%3A%20-1.2%2C%20y%3A%201.0%7D)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22Similarly%2C%20for%20the%20Hessian%20%60get_hessian(objective%2C%20Gamma%2C%20%7Bx%3A-1.2%2Cy%3A1.0%7D)%60%3A%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(Gamma%2C%20get_hessian%2C%20objective%2C%20x%2C%20y)%3A%0A%20%20%20%20get_hessian(objective%2C%20Gamma%2C%20%7Bx%3A%20-1.2%2C%20y%3A%201.0%7D)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22Again%2C%20we%20can%20verify%20that%20these%20values%20match%20our%20work%20by%20hand%20above.%20Now%20we%20have%20all%20the%20ingredients%20necessary%20to%20code%20Newton%E2%80%99s%20method%20(the%20code%20for%20gradient%20descent%20is%20given%20at%20the%20end%20of%20this%20article%20as%20well)%3A%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(get_gradient%2C%20get_hessian%2C%20np%2C%20sm)%3A%0A%20%20%20%20def%20newtons_method(%0A%20%20%20%20%20%20%20%20function%3A%20sm.Expr%2C%0A%20%20%20%20%20%20%20%20symbols%3A%20list%5Bsm.Symbol%5D%2C%0A%20%20%20%20%20%20%20%20x0%3A%20dict%5Bsm.Symbol%2C%20float%5D%2C%0A%20%20%20%20%20%20%20%20iterations%3A%20int%20%3D%20100%2C%0A%20%20%20%20%20%20%20%20tolerance%3A%20float%20%3D%2010e-5%2C%0A%20%20%20%20%20%20%20%20verbose%3A%20int%20%3D%201%2C%0A%20%20%20%20)%20-%3E%20dict%5Bsm
.Symbol%2C%20float%5D%20or%20None%3A%0A%20%20%20%20%20%20%20%20%22%22%22%0A%20%20%20%20%20%20%20%20Perform%20Newton's%20method%20to%20find%20the%20solution%20to%20the%20optimization%20problem.%0A%0A%20%20%20%20%20%20%20%20Args%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20function%20(sm.Expr)%3A%20The%20objective%20function%20to%20be%20optimized.%0A%20%20%20%20%20%20%20%20%20%20%20%20symbols%20(list%5Bsm.Symbol%5D)%3A%20The%20symbols%20used%20in%20the%20objective%20function.%0A%20%20%20%20%20%20%20%20%20%20%20%20x0%20(dict%5Bsm.Symbol%2C%20float%5D)%3A%20The%20initial%20values%20for%20the%20symbols.%0A%20%20%20%20%20%20%20%20%20%20%20%20iterations%20(int%2C%20optional)%3A%20The%20maximum%20number%20of%20iterations.%20Defaults%20to%20100.%0A%20%20%20%20%20%20%20%20%20%20%20%20tolerance%20(float%2C%20optional)%3A%20Threshold%20for%20determining%20convergence.%0A%20%20%20%20%20%20%20%20%20%20%20%20verbose%20(int%2C%20optional)%3A%20Control%20verbosity%20of%20output.%200%20is%20no%20output%2C%201%20is%20full%20output.%0A%0A%20%20%20%20%20%20%20%20Returns%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20dict%5Bsm.Symbol%2C%20float%5D%20or%20None%3A%20The%20solution%20to%20the%20optimization%20problem%2C%20or%20None%20if%20no%20solution%20is%20found.%0A%20%20%20%20%20%20%20%20%22%22%22%0A%0A%20%20%20%20%20%20%20%20x_star%20%3D%20%7B%7D%0A%20%20%20%20%20%20%20%20x_star%5B0%5D%20%3D%20np.array(list(x0.values()))%0A%0A%20%20%20%20%20%20%20%20if%20verbose%20!%3D%200%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20print(f%22Starting%20Values%3A%20%7Bx_star%5B0%5D%7D%22)%0A%0A%20%20%20%20%20%20%20%20for%20i%20in%20range(iterations)%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20gradient%20%3D%20get_gradient(function%2C%20symbols%2C%20dict(zip(x0.keys()%2C%20x_star%5Bi%5D)))%0A%20%20%20%20%20%20%20%20%20%20%20%20hessian%20%3D%20get_hessian(function%2C%20symbols%2C%20dict(zip(x0.keys()%2C%20x_star%5Bi%5D)))%0A%0A%20%20%20%20%20%20%20%20%20%20%20%20x_star%5Bi%20%2B%201%5D%20%3D%20x_star%5Bi%5D.T%20-%20np.l
inalg.inv(hessian)%20%40%20gradient.T%0A%0A%20%20%20%20%20%20%20%20%20%20%20%20if%20np.linalg.norm(x_star%5Bi%20%2B%201%5D%20-%20x_star%5Bi%5D)%20%3C%20tolerance%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20solution%20%3D%20dict(zip(x0.keys()%2C%20%5Bfloat(x)%20for%20x%20in%20x_star%5Bi%20%2B%201%5D%5D))%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20if%20verbose%20!%3D%200%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20print(%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20f%22%5CnConvergence%20Achieved%20(%7Bi%2B1%7D%20iterations)%3A%20Solution%20%3D%20%7Bsolution%7D%22%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20)%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20break%0A%20%20%20%20%20%20%20%20%20%20%20%20else%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20solution%20%3D%20None%0A%0A%20%20%20%20%20%20%20%20%20%20%20%20if%20verbose%20!%3D%200%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20print(f%22Step%20%7Bi%2B1%7D%3A%20%7Bx_star%5Bi%2B1%5D%7D%22)%0A%0A%20%20%20%20%20%20%20%20return%20solution%0A%20%20%20%20return%20(newtons_method%2C)%0A%0A%0A%40app.cell%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22We%20can%20now%20run%20the%20code%20via%20%60newtons_method(objective%2CGamma%2C%7Bx%3A-1.2%2Cy%3A1%7D)%60%3A%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(Gamma%2C%20newtons_method%2C%20objective%2C%20x%2C%20y)%3A%0A%20%20%20%20_%20%3D%20newtons_method(objective%2C%20Gamma%2C%20%7Bx%3A%20-1.2%2C%20y%3A%201%7D)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(%0A%20%20%20%20%20%20%20%20r%22%22%22%0A%20%20%20%20%20%20%20%20%23%23%20Conclusion%0A%0A%20%20%20%20%20%20%20%20There%20you%20have%20it!%20If%20you%20have%20made%20it%20up%20to%20this%20point%2C%20you%20now%20have%20a%20robust%20understanding%20of%20how%20to%20think%20about%20and%20abstractly%20formulate%20unconstrained%20mathematical%20optimization%20problems%2C%20along%20with%20t
he%20basic%20analytical%20approach%20and%20the%20more%20complex%20iterative%20methods%20for%20solving%20such%20problems.%20In%20general%2C%20the%20more%20information%20that%20we%20can%20incorporate%20about%20the%20function%20in%20the%20iterative%20schemes%20(i.e.%2C%20higher%20order%20derivatives)%2C%20the%20fewer%20iterations%20are%20typically%20needed%20to%20converge.%20_Note%20that%20we%20are%20just%20scratching%20the%20surface%20of%20the%20complex%20world%20that%20is%20mathematical%20optimization._%20Nevertheless%2C%20the%20tools%20we%20have%20discussed%20today%20can%20absolutely%20be%20utilized%20in%20practice%20and%20extended%20to%20higher-dimensional%20optimization%20problems.%0A%0A%20%20%20%20%20%20%20%20Stay%20tuned%20for%20Part%202%20of%20this%20series%2C%20where%20we%20will%20extend%20what%20we%20have%20learned%20here%20to%20constrained%20optimization%20problems%2C%20an%20extremely%20practical%20extension%20of%20unconstrained%20optimization.%20In%20fact%2C%20most%20real-world%20optimization%20problems%20will%20have%20some%20form%20of%20constraints%20on%20the%20choice%20variables.%20Then%20we%20will%20shift%20to%20Part%203%20of%20this%20series%2C%20where%20we%20will%20apply%20the%20optimization%20theory%20learned%20and%20additional%20econometric%20%26%20economic%20theory%20to%20solve%20a%20simple%20profit%20maximization%20problem.%20I%20hope%20you%20have%20enjoyed%20reading%20this%20as%20much%20as%20I%20have%20enjoyed%20writing%20it!%0A%0A%20%20%20%20%20%20%20%20%23%23%20Bonus%20-%20The%20Pitfalls%20of%20Newton's%20Method%0A%0A%20%20%20%20%20%20%20%20Despite%20the%20attractiveness%20of%20Newton%E2%80%99s%20method%2C%20it%20is%20not%20without%20its%20own%20pitfalls.%20Notably%2C%20two%20main%20pitfalls%20exist%3A%201)%20NM%20is%20not%20always%20convergent%2C%20even%20when%20choosing%20starting%20points%20near%20the%20solution%2C%20and%202)%20NM%20requires%20the%20computation%20of%20the%20Hessian%20matrix%20at%20each%20ste
p%2C%20which%20can%20be%20computationally%20very%20expensive%20in%20higher%20dimensions.%20For%20pitfall%20%231)%2C%20one%20solution%20is%20the%20Modified%20Newton%20method%20(MNM)%2C%20which%20can%20be%20loosely%20thought%20of%20as%20gradient%20descent%20in%20which%20the%20search%20direction%20is%20given%20by%20the%20Newton%20step%2C%20%CE%94.%20For%20pitfall%20%232)%2C%20quasi-Newton%20methods%2C%20such%20as%20DFP%20or%20BFGS%2C%20have%20been%20proposed%20that%20approximate%20the%20inverse%20Hessian%20used%20at%20each%20step%20to%20reduce%20the%20computational%20burden.%20For%20more%20information%2C%20see%20%5B1%5D.%0A%0A%20%20%20%20%20%20%20%20%23%23%20Supplementary%20Code%0A%20%20%20%20%20%20%20%20%22%22%22%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(get_gradient%2C%20np%2C%20sm)%3A%0A%20%20%20%20def%20gradient_descent(%0A%20%20%20%20%20%20%20%20function%3A%20sm.Expr%2C%0A%20%20%20%20%20%20%20%20symbols%3A%20list%5Bsm.Symbol%5D%2C%0A%20%20%20%20%20%20%20%20x0%3A%20dict%5Bsm.Symbol%2C%20float%5D%2C%0A%20%20%20%20%20%20%20%20learning_rate%3A%20float%20%3D%200.1%2C%0A%20%20%20%20%20%20%20%20iterations%3A%20int%20%3D%20100%2C%0A%20%20%20%20)%20-%3E%20dict%5Bsm.Symbol%2C%20float%5D%20or%20None%3A%0A%20%20%20%20%20%20%20%20%22%22%22%0A%20%20%20%20%20%20%20%20Performs%20gradient%20descent%20optimization%20to%20find%20the%20minimum%20of%20a%20given%20function.%0A%0A%20%20%20%20%20%20%20%20Args%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20function%20(sm.Expr)%3A%20The%20function%20to%20be%20optimized.%0A%20%20%20%20%20%20%20%20%20%20%20%20symbols%20(list%5Bsm.Symbol%5D)%3A%20The%20symbols%20used%20in%20the%20function.%0A%20%20%20%20%20%20%20%20%20%20%20%20x0%20(dict%5Bsm.Symbol%2C%20float%5D)%3A%20The%20initial%20values%20for%20the%20symbols.%0A%20%20%20%20%20%20%20%20%20%20%20%20learning_rate%20(float%2C%20optional)%3A%20The%20learning%20rate%20for%20the%20optimization.%20Defaults%20to%200.1.%0A%20%20%20%20%20%20%20%20%20%20%20%20iterations%20(int%2C%20op
tional)%3A%20The%20maximum%20number%20of%20iterations.%20Defaults%20to%20100.%0A%0A%20%20%20%20%20%20%20%20Returns%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20dict%5Bsm.Symbol%2C%20float%5D%20or%20None%3A%20The%20solution%20found%20by%20the%20optimization%2C%20or%20None%20if%20no%20solution%20is%20found.%0A%20%20%20%20%20%20%20%20%22%22%22%0A%20%20%20%20%20%20%20%20x_star%20%3D%20%7B%7D%0A%20%20%20%20%20%20%20%20x_star%5B0%5D%20%3D%20np.array(list(x0.values()))%0A%0A%20%20%20%20%20%20%20%20x%20%3D%20%5B%5D%20%20%23%23%20Return%20x%20for%20visual!%0A%0A%20%20%20%20%20%20%20%20print(f%22Starting%20Values%3A%20%7Bx_star%5B0%5D%7D%22)%0A%0A%20%20%20%20%20%20%20%20for%20i%20in%20range(iterations)%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20x.append(dict(zip(x0.keys()%2C%20x_star%5Bi%5D)))%20%20%23%23%20Return%20x%20for%20visual!%0A%0A%20%20%20%20%20%20%20%20%20%20%20%20gradient%20%3D%20get_gradient(function%2C%20symbols%2C%20dict(zip(x0.keys()%2C%20x_star%5Bi%5D)))%0A%0A%20%20%20%20%20%20%20%20%20%20%20%20x_star%5Bi%20%2B%201%5D%20%3D%20x_star%5Bi%5D.T%20-%20learning_rate%20*%20gradient.T%0A%0A%20%20%20%20%20%20%20%20%20%20%20%20if%20np.linalg.norm(x_star%5Bi%20%2B%201%5D%20-%20x_star%5Bi%5D)%20%3C%2010e-5%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20solution%20%3D%20dict(zip(x0.keys()%2C%20x_star%5Bi%20%2B%201%5D))%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20print(%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20f%22%5CnConvergence%20Achieved%20(%7Bi%2B1%7D%20iterations)%3A%20Solution%20%3D%20%7Bsolution%7D%22%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20)%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20break%0A%20%20%20%20%20%20%20%20%20%20%20%20else%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20solution%20%3D%20None%0A%0A%20%20%20%20%20%20%20%20%20%20%20%20print(f%22Step%20%7Bi%2B1%7D%3A%20%7Bx_star%5Bi%2B1%5D%7D%22)%0A%0A%20%20%20%20%20%20%20%20return%20solution%2C%20x%0A%20%20%20%20return%20(gradient_descent%2C)%0A%0A%0A%40app.cel
l%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(%0A%20%20%20%20%20%20%20%20r%22%22%22%0A%20%20%20%20%20%20%20%20%23%23%20References%20%0A%0A%20%20%20%20%20%20%20%20%5B1%5D%20Snyman%2C%20J.%20A.%2C%20%26%20Wilke%2C%20D.%20N.%20(2019).%20Practical%20mathematical%20optimization%3A%20Basic%20optimization%20theory%20and%20gradient-based%20algorithms%20(2nd%20ed.).%20Springer.%0A%0A%20%20%20%20%20%20%20%20%5B2%5D%20%5BGradient%20Descent%20Wiki%20Page%5D(https%3A%2F%2Fen.wikipedia.org%2Fwiki%2FGradient_descent)%0A%0A%20%20%20%20%20%20%20%20%5B3%5D%20%5BNewton's%20Method%20Wiki%20Page%5D(https%3A%2F%2Fen.wikipedia.org%2Fwiki%2FNewton%2527s_method%23%3A~%3Atext%3DIn%2520numerical%2520analysis%252C%2520Newton%2527s%2520method%252C%2520also%2520known%2520as%2Croots%2520%2528or%2520zeroes%2529%2520of%2520a%2520real%2520-valued%2520function.)%0A%0A%20%20%20%20%20%20%20%20%3Cdiv%20style%3D%22text-align%3A%20center%3B%20font-size%3A%2024px%3B%22%3E%E2%9D%96%E2%9D%96%E2%9D%96%3C%2Fdiv%3E%0A%0A%20%20%20%20%20%20%20%20%3Ccenter%3E%0A%20%20%20%20%20%20%20%20Access%20all%20the%20code%20via%20this%20Marimo%20Notebook%20or%20my%20%5BGitHub%20Repo%5D(https%3A%2F%2Fgithub.com%2Fjakepenzak%2Fblog-posts)%0A%0A%20%20%20%20%20%20%20%20I%20appreciate%20you%20reading%20my%20post!%20My%20posts%20primarily%20explore%20real-world%20and%20theoretical%20applications%20of%20econometric%20and%20statistical%2Fmachine%20learning%20techniques%2C%20but%20also%20whatever%20I%20am%20currently%20interested%20in%20or%20learning%20%F0%9F%98%81.%20At%20the%20end%20of%20the%20day%2C%20I%20write%20to%20learn!%20I%20hope%20to%20make%20complex%20topics%20slightly%20more%20accessible%20to%20all.%0A%20%20%20%20%20%20%20%20%3C%2Fcenter%3E%0A%20%20%20%20%20%20%20%20%22%22%22%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0Aif%20__name__%20%3D%3D%20%22__main__%22%3A%0A%20%20%20%20app.run()%0A