autodiff-djw-0.1.3



Description

Automatic differentiation package AC207
Attribute           Value
Operating system    -
File name           autodiff-djw-0.1.3
Name                autodiff-djw
Library version     0.1.3
Maintainer          []
Maintainer email    []
Author              David Chataway, Javiera Astudillo, Wenhan Zhang
Author email        jastudillo@g.harvard.edu
Homepage            https://github.com/DJ-WJ/cs107-FinalProject
Package URL         https://pypi.org/project/autodiff-djw/
License             -
[![codecov](https://codecov.io/gh/DJ-WJ/cs107-FinalProject/branch/master/graph/badge.svg?token=C38YL2T5X5)](https://codecov.io/gh/DJ-WJ/cs107-FinalProject) [![Build Status](https://travis-ci.com/DJ-WJ/cs107-FinalProject.svg?token=q44Rdg7inB6rSpQTpr1j&branch=master)](https://travis-ci.com/DJ-WJ/cs107-FinalProject)

# cs107-FinalProject

## Group 26

* David Chataway
* Javiera Astudillo
* Wenhan Zhang

## Introduction

Our library, AutoDiff, provides an automatic differentiation (AD) tool that is designed to be very user-friendly. The base autodiff class contains modules for both forward and reverse mode. A user provides an array of values and multivariate lambda functions, specifies the mode and (optionally) a seed vector, and is able to compute the Jacobian, the gradients and the directional derivative with respect to a custom direction, so that other users will be able to use the library in other algorithms like Newton's method.

Developing a user-friendly and efficient AD library is important for two primary reasons:

1. Automatic differentiation can improve upon symbolic differentiation offered by symbolic libraries (like SymPy) or by web services (like Wolfram Alpha): AD methods can achieve machine precision while executing faster, because AD does not need to carry through the whole expression tree and does not suffer from exponentially longer derivative expressions (due, for example, to repeated product rules).
2. Efficient automatic differentiation is necessary to optimize computational cost and time due to the different scaling between forward and reverse modes and the storing of intermediate variables and expressions. This is crucial in many applications such as optimization algorithms (e.g. gradient descent).

## Background

In our understanding, at the heart of AD is the idea that functional expressions can be broken down into elementary symbolic expressions, and the numerical result of the function's derivatives can be calculated by storing the value and derivative value at the intermediate variables of those expressions. The graph structure of automatic differentiation arises because these intermediate variables can have both parents and children that each point to different steps in the evaluation trace. Furthermore, many of these intermediate variables will have multiple children - i.e. they will be used more than once in the evaluation trace.

Although forward and reverse mode differ in terms of the order in which the graph is evaluated, both modes 1) produce the same graph, 2) evaluate the derivatives at each child node by using a stored derivative value with respect to the parent node(s), and 3) take a function of dimension n (number of input variables) and produce its numerical derivative, in the form of a Jacobian (or its transpose) of dimension m (number of outputs).

<img src="imgs/jacobian.PNG" width="400">

### Forward mode

In the forward mode, we calculate the partial derivatives by evaluating the chain rule for each intermediate variable during the forward pass. This is done by both applying a symbolic derivative of the intermediate variable and directly applying the derivative value of the parent node(s). For instance, for an intermediate variable expressed as a function of a function (its parent node), h(u(t)), the chain rule is evaluated as follows:

<img src="imgs/forward_formula.PNG" width="100">

where the first term of the multiplication is evaluated symbolically and the second term is taken from the derivative value of the parent node.
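To make this concrete, here is a minimal standalone sketch (not the library's own implementation) of how a value/derivative pair is pushed through h(u(t)) with u(t) = t**2 and h(u) = sin(u), seeding dt/dt = 1. The variable names are purely illustrative.

```
import numpy as np

# Each intermediate variable stores (value, derivative w.r.t. the seed direction).
t_val, t_der = 2.0, 1.0                      # seed the input: dt/dt = 1

# u = t**2: symbolic factor du/dt = 2*t, multiplied by the stored parent derivative
u_val, u_der = t_val**2, 2 * t_val * t_der

# h = sin(u): symbolic factor dh/du = cos(u), multiplied by the stored parent derivative
h_val, h_der = np.sin(u_val), np.cos(u_val) * u_der

print(h_val, h_der)                          # sin(4) and cos(4)*4, i.e. dh/dt at t = 2
```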
Because of the graph structure, the forward mode will be able to compute the partial derivative of all of the m outputs with respect to a single input variable (if seeded in the unit direction of that specific input variable). That is, each forward pass produces a single column of the Jacobian matrix.

### Reverse mode

In the reverse mode, rather than directly computing the chain rule at each intermediate variable, you just store the partial derivatives and the dependencies of the expression tree during the forward pass. Then, you do a reverse pass by starting at a single output variable, 'seeding' it (e.g. setting the derivative of that output variable to 1 and the derivatives of the other output variables to 0) and propagating backwards to all the input variables. We are essentially asking, 'which input variables can this output variable affect?' This is possible by calculating the 'adjoint', which stores the current value of the partial derivative of the output with respect to the variable at node i according to the following formula, where j is a child of i.

<img src="imgs/reverse_formula_1.PNG" width="175">

However, in order to calculate the derivative of a different output variable, we need to re-run the program with a different unit seed. As a result, each run of the reverse mode will generate a single row of the Jacobian (or a column of the transpose of the Jacobian).

#### Example

For example, consider the function f(x,y), shown below (same as the one in pair programming 6).

<img src="imgs/Example.PNG" width="250">

Rather than storing the entire symbolic derivative expression, we can instead break down the function into intermediate variables (shown below) and store (or overload) the symbolic derivatives of the elemental operations of +/-, \*\*2, exp(), sin() and cos(). For forward mode, all that needs to be stored is a tuple of the value and the derivative value according to a particular seed. Thus, this evaluation would need to be run twice for the two seed vectors corresponding to each input variable.

<img src="imgs/forward_example_trace.PNG" width="350">

If this automatic differentiation were to be evaluated in reverse mode, on the forward pass at each node we would instead store the values of the intermediate variable partial derivatives with respect to each parent argument (rather than applying the chain rule), and then we would work backwards in one pass to calculate df/dx and df/dy using the adjoint method. These graph structures are outlined below:

<img src="imgs/graph_structure.PNG" width="900">

## How to use

Our solution is provided as a Python package called `AutoDiff` and is available through [`pip`](https://pypi.org/project/pip/), a package installer tool for Python. The prerequisite to use `AutoDiff` is to have [`Python`](https://www.python.org/downloads/) installed as well as the [`numpy`](https://numpy.org/install/) and [`scipy`](https://www.scipy.org/install.html) packages. If you install `AutoDiff` through `pip`, the dependencies (`numpy` and `scipy`) will be installed automatically.

### PIP

```
python3 -m pip install autodiff-djw
```

### Create `AutoDiff`

The main access point to our library is through the `AutoDiff` class. To instantiate an object of this class the user must provide the following arguments:

`AutoDiff(f, vals, mode='forward')`

* `f`: *array_like* &nbsp; List-like output functions. Order preserving.
* `vals`: *array_like* &nbsp; List-like initial values for inputs. All output functions will be evaluated at these values.
* `mode`: *string (default=`forward`)* &nbsp; Either `forward` or `reverse` mode.

***Example: new `AutoDiff`***

```
import numpy as np
from autodiff import AutoDiff

inputs_vals = [np.pi/2, np.pi/4, np.pi/8]

f1 = ...
f2 = ...
f = [f1, f2]

autodiff = AutoDiff(vals=inputs_vals, f=f, mode='forward')
```

### Function creation

To create functions (`[f1,f2]` in the previous example) that are compatible with our library methods (i.e. that implement automatic differentiation of their outputs) you must provide regular Python functions which: (1) meet the signature described below and (2) use functions supported by our library.

#### Function signature

The elements of `f` (provided at `AutoDiff` instantiation) must be regular Python functions which meet the following signature:

```
f1(inputs)

Arguments
---------
inputs: array_like
    A sequence representing the inputs. Each element represents one input variable (scalar).

Returns
-------
output: numeric (scalar)
    A function of the inputs.
```

#### Supported operations

To enable automatic differentiation features, any user defined function must use the library provided functions:

* Basic Operations:
    * Addition (commutative) (`+`)
    * Subtraction (`-`)
    * Multiplication (commutative) (`*`)
    * Division (`/`)
    * Power (`**`)
    * Negation (`-`)
* Elementary Functions:
    * `log` (`base` configurable)
    * `exp`
    * `sin`
    * `cos`
    * `tan`
    * `arcsin`
    * `arccos`
    * `arctan`
    * `sinh`
    * `cosh`
    * `tanh`
    * `sqrt`
    * `expit`

These functions may be accessed through the `autodiff.elementary` module and the user should import them before declaring a function. All of the elementary functions meet the same signature as their `numpy` equivalents.

***Example: declaring functions***

```
from autodiff.elementary import sin

f1 = lambda inputs: inputs[0]*inputs[2]**2
f2 = lambda inputs: inputs[2]*sin(inputs[1])
f = [f1, f2]
```

or equivalently:

```
def f1(inputs):
    x1 = inputs[0]  # first input
    x3 = inputs[2]  # third input
    return x1*(x3**2)

def f2(inputs):
    x2 = inputs[1]  # second input
    x3 = inputs[2]  # third input
    return x2*sin(x3)

f = [f1, f2]
```

### `AutoDiff` attributes and methods

#### Attributes

* `inputs`: get the user provided input values.
* `outputs`: get the output of the user provided functions, evaluated at `inputs`.
* `jacobian`: get the Jacobian.

#### Methods

* **dir_der**

```
dir_der(seed)

Calculates the derivative of each output function in the direction of `dir`=`seed`.

Arguments
---------
seed: array-like (M,)
    Represents the direction in which to calculate the derivative of each output function.
    Must have the same dimension as the number of inputs.
    M: number of inputs

Returns
-------
list (N,)
    Derivative of each output function in the direction of the `seed` vector.
    N: number of functions
```

* **gradient**

```
gradient(fn_idx)

Creates or pulls a row of the Jacobian. Calculates the derivative of the output function
`fn_idx` with respect to each input variable.

Arguments
---------
fn_idx: int
    Index of the output function for which to calculate the gradient.

Returns
-------
list (M,)
    M: number of inputs.
```

### Demo

Let's take the following vector function to illustrate the usage of our library:

<img src="imgs/demo_formula.png" width="950">

Now we'll evaluate this function at the point x = [pi/2, pi/4, pi/8] and evaluate the directional derivative of each output function with respect to v = [1,2,1]:

<img src="imgs/demo_result.png" width="700">

This analytical solution may be approximated with the `AutoDiff` library through either the forward or reverse mode, as follows.
It can be seen that the returned derivatives are equal to each other (within machine precision) and to the expected value.

#### Forward Mode

```
import numpy as np
from autodiff import AutoDiff
from autodiff.elementary import log, exp, sin, cos, tan, arcsin, arccos, arctan, sinh, cosh, tanh, sqrt, expit

inputs_vals = [np.pi/2, np.pi/4, np.pi/8]
seed = [1,2,1]

f1 = lambda inputs: inputs[0]*inputs[2]**2
f2 = lambda inputs: inputs[2]*sin(inputs[1])
f = np.array([f1, f2])

autodiff = AutoDiff(f=f, vals=inputs_vals, mode='forward')

print(autodiff.inputs)
>> [1.5707963267948966, 0.7853981633974483, 0.39269908169872414]

print(autodiff.outputs)
>> [0.24223653656484231, 0.2776801836348979]

print(autodiff.jacobian)
>> [[0.15421257 0.         1.23370055]
    [0.         0.27768018 0.70710678]]

print(autodiff.gradient(fn_idx=0))
>> [0.15421257 0.         1.23370055]

print(autodiff.dir_der(seed=seed))
>> [1.38791312 1.26246715]
```

#### Reverse Mode

```
import numpy as np
from autodiff import AutoDiff
from autodiff.elementary import log, exp, sin, cos, tan, arcsin, arccos, arctan, sinh, cosh, tanh, sqrt, expit

inputs_vals = [np.pi/2, np.pi/4, np.pi/8]
seed = [1,2,1]

f1 = lambda inputs: inputs[0]*inputs[2]**2
f2 = lambda inputs: inputs[2]*sin(inputs[1])
f = np.array([f1, f2])

autodiff = AutoDiff(f=f, vals=inputs_vals, mode='reverse')

print(autodiff.inputs)
>> [1.5707963267948966, 0.7853981633974483, 0.39269908169872414]

print(autodiff.outputs)
>> [0.24223653656484231, 0.2776801836348979]

print(autodiff.jacobian)
>> [[0.15421257 0.         1.23370055]
    [0.         0.27768018 0.70710678]]

print(autodiff.gradient(fn_idx=0))
>> [0.15421257 0.         1.23370055]

print(autodiff.dir_der(seed=seed))
>> [1.38791312 1.26246715]
```

## Software organization

The current structure of our project is as follows:

```
.
├── src
│   ├── autodiff
│   │   ├── autodiff_.py
│   │   ├── elementary.py
│   │   ├── forward.py
│   │   ├── reverse.py
│   │   └── ...
│   ├── demo
│   └── tests
└── docs
    ├── README.md
    └── imgs
```

The `forward.py` and `reverse.py` modules contain the core functions to use our `AutoDiff` package in forward and reverse mode respectively. `elementary.py` provides elementary math functions to use in conjunction with our library, while `autodiff_.py` contains the `AutoDiff` class for accessing the auto-differentiation methods of our library.

All our tests are located in `src/tests` and are integrated with Travis CI and Codecov. To run the test suite you'll need to install `pytest` and `codecov`:

```
pip install pytest pytest-cov
pip install codecov
```

Once both packages are installed you may run the test suite locally (starting at the root of our folder structure):

```
cd .
export PYTHONPATH="$PWD/src"
python -m pytest --cov src/tests
```

## Implementation details

There's a common interface to access the automatic differentiation tools provided by this library: the `AutoDiff` class. Under the hood, `AutoDiff` implements a factory pattern at instantiation that returns either an `AutoDiffForward` or `AutoDiffReverse` object. This pattern was selected so that the usage of the `AutoDiff` object is transparent to the user with regard to the auto-differentiation mode used.

#### Forward Mode

1) A "`Variable`" class was created with `.val` and `.der` attributes and overloaded elementary dunder methods and elementary functions.
2) The AutoDiffForward class automatically creates `Variable` objects according to the input array.
3) The AutoDiffForward class creates the Jacobian (`.jacobian`) by calling the directional derivative (`dir_der`) method on unit seed vectors, iterating through the size `M` (number of input variables).
4) The AutoDiffForward class `gradient` method requires the creation of the Jacobian and returns a row of it.

Let's look at a sample function `sin` within the `Variable` class to understand better how it implements the auto-differentiation underneath:

```
class Variable:
    def __init__(self, **kwargs):
        ...
        self.val = kwargs['val']
        self.der = kwargs['der']
        ...

    def sin(a):
        out = Variable()
        try:
            out.val = np.sin(a.val)
            out.der = np.cos(a.val)*a.der
        except AttributeError:
            try:
                out.val = np.sin(a)
            except Exception as e:
                raise TypeError('Sin failed: `input` not a `Variable` object or a real number.')
        return out
```

For the forward calculation you need to first seed the value of the derivative at the very beginning (`self.der = kwargs['der']`). Once you've done that, any call to a mathematical function passing a `Variable` object as argument will immediately return a `Variable` object with the updated `val` and `der`, with no need to keep track of the forward pass. For the case of `sin`, we know that the output value is `np.sin(a.val)` while its partial derivative will be `np.cos(a.val)*a.der`, following the chain rule.

#### Reverse Mode

1) A "`Node`" class was created with `.val`, `.grad_value` and `.children` ([partial der, child object]) attributes and overloaded elementary dunder methods and elementary functions that update `.children`.
   - A `grad()` method was created that recursively sums the (partial der * adjoint) for all children of the parent Node (thus acting as the backwards pass).
2) The AutoDiffReverse class automatically creates `Node` objects according to the input array.
3) The AutoDiffReverse class creates the Jacobian by calling the `gradient` method on unit seed vectors, iterating through the size `N` (number of output functions).
4) The AutoDiffReverse class directional derivative (`dir_der`) method requires the creation of the Jacobian and returns a matrix multiplication of the Jacobian times the custom vector.

Let's look at the same sample function `sin`, but within the `Node` class this time:

```
class Node:
    def __init__(self, **kwargs):
        self.val = 0
        self.children = []
        self.grad_value = None  # the adjoint
        ...

    def grad(self):
        # recurse only if the value is not yet cached
        if self.grad_value is None:
            # recursively calculate derivative (adjoint) using chain rule
            self.grad_value = sum(weight * node.grad() for weight, node in self.children)
        return self.grad_value

    def gradient(self, fn_idx):
        ...
        inputs_vars = [Node(val=val) for val in self.vals]  # create Node inputs
        f_vars = np.array([f_(inputs_vars) for f_ in self.f])

        # Seed with .grad_value = 1 only for the function at fn_idx and .grad_value = 0 for all others
        for i in range(len(f_vars)):
            if i == fn_idx:
                f_vars[i].grad_value = 1.0
            else:
                f_vars[i].grad_value = 0

        # Get the partial derivative wrt each input
        der = [inputs_.grad() for inputs_ in inputs_vars]
        return der

    def sin(a):
        out = Node()
        try:
            out.val = np.sin(a.val)
            a.children.append((np.cos(a.val), out))  # weight = ∂z/∂self = cos(self)
        except AttributeError:
            try:
                out.val = np.sin(a)
            except Exception as e:
                raise TypeError('Sin failed: `input` not a `Node` object or a real number.')
        return out
```

Note that for calculating the gradient through the reverse mode we need to first append the partial derivatives and the dependencies of the expression tree during the forward pass.
This is done when calling `f_vars = np.array([f_(inputs_vars) for f_ in self.f])` within `gradient`, which passes through our custom functions such as `sin`. Then, you do a reverse pass by starting at a single output variable and 'seeding' it (i.e. setting the derivative of that output variable to 1 and the derivatives of the other output variables to 0). This is executed at the lines `f_vars[i].grad_value = ...`. After seeding, you propagate backwards to all the input variables (this is done when calling `der = [inputs_.grad() for inputs_ in inputs_vars]`).

## Our extension

Our proposed extension includes a working `reverse` mode implementation. At the user level, it exposes the same methods and attributes as the forward mode, and the usage is transparent for the user (who doesn't notice any change in usage other than setting the mode to `reverse`). More specifically, the user may instantiate an `AutoDiff` object through the following syntax (similar to the `forward` syntax):

```
autodiff = AutoDiff(vals=inputs_vals, f=f, mode='reverse')
```

The user may access the following methods and attributes from this class:

* `inputs`
* `outputs`
* `jacobian`
* `dir_der(seed)`
* `gradient(fn_idx)`

These represent exactly the same values as for the `forward` mode described in previous sections, and their usage is exactly the same as in the `forward` mode. All background content necessary to understand the `reverse` mode may be found in the `Background > Reverse mode` section, while its usage is described throughout the previous documentation alongside the `forward` usage.

## Broader Impact and Inclusivity Statement

We created this library to be as user-friendly as possible and to assist with common differentiation tasks like Newton's method and optimization problems. However, it is possible that as the library grows over time - in usage and functionality - people may misuse our library. While we anticipate that initially the users will be primarily academics and hobbyists, what happens if the library becomes adopted by those with broad and significant societal impact? For instance, if the COMPAS tool - used to assess the likelihood of defendants becoming recidivists - adopted our library to optimize their predictive models, it would become crucial to maintain the integrity of the library. As a result, we would need to devote more resources to testing, error management and debugging. Otherwise, the societal impacts of an erroneous automatic differentiation library could be vast.

In making our AD library as user-friendly as possible, we considered the perspectives of a varied audience and sought not to deliberately exclude individuals or groups based on different cultural, economic, social, gender and disability groups. For instance, we tried to refrain from using exclusive terms such as master. However, we are cognisant that our library is exclusive because of three core deficiencies: 1) our documentation and much of our code's functions are written only in English, 2) our code and documentation make reference to parent and child (nodes), which may be triggering to some, and 3) pull requests to the library can only be approved by a select few, which may make others feel like they cannot contribute. Given more time, we would like to 1) make our library as language-friendly as possible, by writing documentation in different languages and code that is less English-centric, 2) determine better terminology for the parent-child relationship, and 3) democratize the contribution review and approval process.
Finally, it is important to consider that there may be unanticipated exclusive characteristics of our library that will be revealed with broader usage. Thus, it is paramount that a forum be created for users to voice concerns, and for there to be a standardized process to review those concerns and implement changes if needed.

## Future Features

Future work on the library will remain focused on the original objective: to make an AD package that is as user-friendly as possible for our target users, primarily academics and hobbyists. As such, development work on future features is as follows:

- Add more compatible functions, primarily those used in academic research (e.g. absolute value, Gaussian)
  - Complement the test suite with tests for the additional functions
- Print run time and/or computational intensity of the class methods
  - Initial ideas include implementing the time module and using the time.clock() method
- Add logic to decide on using forward or reverse mode based on m, n dimensionality (a possible heuristic is sketched below)
  - Print the suggestion to the user to help with usability
- Create a graphical user interface (GUI) to execute the library
  - This could be done by hosting a public server similar to autoed.herokuapp.com
- Create a cache to improve runtime (e.g. a data structure to track stored intermediate variables on independent passes)
  - Initial ideas for the data structures include a dictionary or graph
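As a rough illustration of the mode-selection idea above, the following sketch shows one possible heuristic. The helper name `choose_mode` and the exact rule are hypothetical and not part of the current library; it simply encodes the observation from the Background section that forward mode needs one pass per input while reverse mode needs one pass per output.

```
def choose_mode(num_inputs, num_outputs):
    """Suggest an AD mode: forward mode costs one pass per input,
    reverse mode costs one pass per output."""
    return 'forward' if num_inputs <= num_outputs else 'reverse'

# Example: 3 inputs and 2 output functions (as in the demo above) -> 'reverse'
suggested = choose_mode(3, 2)
print(suggested)

# The suggestion could then be passed to the existing constructor, e.g.:
# AutoDiff(f=f, vals=inputs_vals, mode=suggested)
```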


Requirements

Name     Value
numpy    -
scipy    -


Required language

Name      Value
Python    >=3.6


Installation


Installing the autodiff-djw-0.1.3 whl package:

    pip install autodiff-djw-0.1.3.whl


Installing the autodiff-djw-0.1.3 tar.gz package:

    pip install autodiff-djw-0.1.3.tar.gz
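
After installing by either method, a quick way to verify the install is to compute a small Jacobian using the `AutoDiff` interface documented above (this is an illustrative sketch, not an official example from the package; the expected output is simply the analytic Jacobian):

    import numpy as np
    from autodiff import AutoDiff
    from autodiff.elementary import sin

    # f(x1, x2) = x1*sin(x2), evaluated at x1 = 1.0, x2 = pi/2
    f = [lambda inputs: inputs[0]*sin(inputs[1])]
    ad = AutoDiff(f=f, vals=[1.0, np.pi/2], mode='forward')

    print(ad.jacobian)  # analytically [[sin(pi/2), 1.0*cos(pi/2)]] = [[1.0, 0.0]]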