![Logo](https://i.ibb.co/zZ88YRq/3ee77eca-5573-4591-b911-b0a01ea0ad3a-200x200.png)
[![Build Status](https://travis-ci.com/azfar154/fastTF.svg?token=f7cQs9ipscGj1qwuxd1Q&branch=master)](https://travis-ci.com/azfar154/fastTF)
fastTF is an easy way to convert a Pandas DataFrame into a TensorFlow TFRecord file. It can also return the matching `example_spec`, which is used to parse the records back.
### Why would you do so?
- A TFRecord file can make your input pipeline faster.
- Binary data takes up less space on disk, takes less time to copy and can be read much more efficiently from disk.
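
The size difference can be illustrated with a small sketch that uses only the standard library. It does not use the TFRecord format itself; as a simplifying assumption, it packs the same floating-point values once as fixed-width binary and once as comma-separated text. The saving depends on the data: short integers can be smaller as text.

```python
import struct

# 1,000 floating-point values with long decimal expansions (illustrative data).
values = [i / 7 for i in range(1, 1001)]

# Text encoding: comma-separated decimal strings, as in a CSV row.
text_bytes = ",".join(repr(v) for v in values).encode("utf-8")

# Binary encoding: each value packed as an 8-byte little-endian double.
binary_bytes = struct.pack(f"<{len(values)}d", *values)

print(f"text:   {len(text_bytes)} bytes")
print(f"binary: {len(binary_bytes)} bytes")

# The binary form round-trips exactly, with no string parsing needed.
restored = list(struct.unpack(f"<{len(values)}d", binary_bytes))
```

Reading the binary form back is a single `struct.unpack` call, whereas the text form has to be split and parsed value by value.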
### Tech
fastTF uses a number of open source projects to work properly:
* [TensorFlow](https://www.tensorflow.org/) - "An end-to-end open source machine learning platform"
* [Pandas](https://pandas.pydata.org/) - "pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language."
### Installation
fastTF requires [Python](https://www.python.org/downloads/release/python-360/) 3.6 to run.
Install the necessary packages and dependencies:
```sh
$ pip3 install tensorflow
$ pip3 install pandas
```
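
To confirm the installation, a short check along these lines may help. It uses only the standard library; the helper name `check_environment` is made up for this example and is not part of fastTF.

```python
import importlib.util
import sys


def check_environment(packages=("tensorflow", "pandas")):
    """Report whether Python is new enough and each package can be found."""
    report = {"python>=3.6": sys.version_info >= (3, 6)}
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item}: {'ok' if ok else 'missing'}")
```

The check only looks the packages up and does not import them, so it stays fast even though TensorFlow is slow to load.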
### Development
Want to contribute? Great!
fastTF uses TensorFlow + Pandas for fast development.
Fork this repository and change app.py.
Open your terminal and run these commands to edit the file:
```sh
$ cd fastTF
$ nano app.py
```
### Example
````
import pickle

import pandas as pd
import tensorflow as tf
from fastTF import tfRecordWriter


def test_function():
    """
    Test the package.

    :return: True if the program was successful.

    >>> test_function()
    True
    """
    # Convert the DataFrame into a TFRecord file.
    data = pd.read_csv('diabetes.csv')
    test = tfRecordWriter(data)
    test.write('new.tfrecords')

    # The stored spec should match the one generated by fastTF.
    with open('example_spec.pickle', 'rb') as f:
        example_spec = pickle.load(f)
    assert example_spec == test.get_example_spec()

    # Read the records back and check the first row.
    data = tf.data.TFRecordDataset('new.tfrecords')
    func = lambda x: tf.io.parse_single_example(x, example_spec)
    data = data.map(func)
    y = data.take(1)
    for x in y:
        assert x['Age'].numpy() == 50
    return True
````
### Metrics
### Memory Test
```sh
Memory Test
Line # Mem usage Increment Line Contents
================================================
1 import pandas as pd
2 from fastTF import tfRecordWriter
3 import tensorflow as tf
4 import pickle
5 import doctest
6 import pytest
7
8 300.7 MiB 300.7 MiB
9 301.0 MiB 0.2 MiB def test_function():
10 301.0 MiB 0.0 MiB """
11 301.0 MiB 0.0 MiB Test the package
12 >>> test_function()
13 301.0 MiB 0.0 MiB True
14 301.0 MiB 0.0 MiB
15 301.0 MiB 0.0 MiB """
16 data = pd.read_csv('diabetes.csv')
17 301.0 MiB 0.0 MiB test = tfRecordWriter(data)
18 301.0 MiB 0.0 MiB test.write('new.tfrecords')
19 301.0 MiB 0.0 MiB
20 301.0 MiB 0.0 MiB with open('example_spec.pickle','rb') as f:
21 301.3 MiB 0.2 MiB example_spec = pickle.load(f)
22 301.3 MiB 0.0 MiB assert example_spec == test.get_example_spec()
23
24 data = tf.data.TFRecordDataset('new.tfrecords')
25 func = lambda x: tf.io.parse_single_example(x,example_spec)
26 data = data.map(func)
27 y = data.take(1)
28 for x in y:
29 assert x['Age'].numpy() == 50
30 return True
```
### Speed Test
````sh
Timer unit: 1e-06 s
Total time: 0.644076 s
File: /notebooks/package/tests/test_sample.py
Function: test_function at line 8
Line # Hits Time Per Hit % Time Line Contents
==============================================================
8 def test_function():
9 1 6395.0 6395.0 1.0 data = pd.read_csv('diabetes.csv')
10 1 602.0 602.0 0.1 test = tfRecordWriter(data)
11 1 589870.0 589870.0 91.6 test.write('new.tfrecords')
12
13 1 57.0 57.0 0.0 with open('example_spec.pickle','rb') as f:
14 1 79.0 79.0 0.0 example_spec = pickle.load(f)
15 1 28.0 28.0 0.0 assert example_spec == test.get_example_spec()
16
17 1 8591.0 8591.0 1.3 data = tf.data.TFRecordDataset('new.tfrecords')
18 1 3.0 3.0 0.0 func = lambda x: tf.io.parse_single_example(x,example_spec)
19 1 25952.0 25952.0 4.0 data = data.map(func)
20 1 245.0 245.0 0.0 y = data.take(1)
21 2 12227.0 6113.5 1.9 for x in y:
22 1 27.0 27.0 0.0 assert x['Age'].numpy() == 50
````
### Another Example
```sh
>>> import pandas as pd
>>> data = pd.read_csv('diabetes.csv')
>>> from fastTF import tfRecordWriter
>>> demo = tfRecordWriter(data)
>>> demo.write("name.tfrecord")
>>> demo.get_example_spec()
{'Pregnancies': FixedLenFeature(shape=(), dtype=tf.int64, default_value=None), 'Glucose': FixedLenFeature(shape=(), dtype=tf.int64, default_value=None), 'BloodPressure': FixedLenFeature(shape=(), dtype=tf.int64, default_value=None), 'SkinThickness': FixedLenFeature(shape=(), dtype=tf.int64, default_value=None), 'Insulin': FixedLenFeature(shape=(), dtype=tf.int64, default_value=None), 'Age': FixedLenFeature(shape=(), dtype=tf.int64, default_value=None), 'Outcome': FixedLenFeature(shape=(), dtype=tf.int64, default_value=None), 'BMI': FixedLenFeature(shape=(), dtype=tf.float32, default_value=None), 'DiabetesPedigreeFunction': FixedLenFeature(shape=(), dtype=tf.float32, default_value=None)}
```
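
The spec above suggests that integer columns map to `tf.int64` and floating-point columns to `tf.float32`. As a rough illustration only, and not fastTF's actual implementation, that mapping could be sketched in plain Python. `FeatureSpec` is a hypothetical stand-in for `tf.io.FixedLenFeature`, and `infer_example_spec` is an invented helper; dtype names are passed as strings so the sketch runs without TensorFlow or Pandas.

```python
from typing import Dict, NamedTuple, Tuple


class FeatureSpec(NamedTuple):
    """Hypothetical stand-in for tf.io.FixedLenFeature."""
    shape: Tuple[int, ...]
    dtype: str


def infer_example_spec(column_dtypes: Dict[str, str]) -> Dict[str, FeatureSpec]:
    """Map pandas-style dtype names to scalar feature specs (assumed rule)."""
    spec = {}
    for column, dtype in column_dtypes.items():
        if dtype.startswith("int"):
            spec[column] = FeatureSpec(shape=(), dtype="tf.int64")
        elif dtype.startswith("float"):
            spec[column] = FeatureSpec(shape=(), dtype="tf.float32")
        else:
            # Other dtypes (strings, dates) are outside this sketch.
            raise TypeError(f"unsupported dtype for {column!r}: {dtype}")
    return spec


# Three of the diabetes.csv columns, with dtype names assumed for illustration.
columns = {"Glucose": "int64", "Age": "int64", "BMI": "float64"}
example_spec = infer_example_spec(columns)
```

Raising on unknown dtypes is a deliberate choice in this sketch: a silently skipped column would only surface later as a parsing error.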
### Todos
- Write more tests
- Make the app faster