PyStellarDB-0.11.0
==================


Description
===========

Python interface to StellarDB.

:OS: OS Independent
:File name: PyStellarDB-0.11.0
:Name: PyStellarDB
:Version: 0.11.0
:Maintainer: []
:Maintainer email: []
:Author: Zhiping Wang
:Author email: zhiping.wang@transwarp.io
:Homepage: https://github.com/WarpCloud/PyStellarDB
:Project URL: https://pypi.org/project/PyStellarDB/
:License: Apache License, Version 2.0
PyStellarDB
===========

PyStellarDB is a Python API for executing Transwarp Extended OpenCypher (TEoC) and Hive queries. It can also generate an RDD object for use in PySpark. It is based on PyHive (https://github.com/dropbox/PyHive) and PySpark (https://github.com/apache/spark/).

PySpark RDD
===========

We hack a way to generate the RDD object using the same method as ``sc.parallelize(data)``. This can cause memory panic if the query returns a large amount of data. If you do need huge data, use the following workaround:

1. If you are querying a graph, refer to Chapter 4.4.5 of the StellarDB manual to save the query result into a temporary table.
2. If you are querying a SQL table, save your query result into a temporary table.
3. Find the HDFS path of the temporary table generated in Step 1 or Step 2.
4. Use an API like ``sc.newAPIHadoopFile()`` to generate the RDD.

Usage
=====

PLAIN Mode (no security configured)
-----------------------------------

.. code-block:: python

    from pystellardb import stellar_hive

    conn = stellar_hive.StellarConnection(host="localhost", port=10000, graph_name='pokemon')
    cur = conn.cursor()
    cur.execute('config query.lang cypher')
    cur.execute('use graph pokemon')
    cur.execute('match p = (a)-[f]->(b) return a,f,b limit 1')

    print(cur.fetchall())

LDAP Mode
---------

.. code-block:: python

    from pystellardb import stellar_hive

    conn = stellar_hive.StellarConnection(host="localhost", port=10000, username='hive', password='123456', auth='LDAP', graph_name='pokemon')
    cur = conn.cursor()
    cur.execute('config query.lang cypher')
    cur.execute('use graph pokemon')
    cur.execute('match p = (a)-[f]->(b) return a,f,b limit 1')

    print(cur.fetchall())

Kerberos Mode
-------------

.. code-block:: python

    # Make sure you have the correct realm information about the KDC server in /etc/krb5.conf
    # Make sure you have the correct keytab file in your environment
    # Run the kinit command:
    #   On Linux: kinit -kt FILE_PATH_OF_KEYTAB PRINCIPAL_NAME
    #   On Mac:   kinit -t FILE_PATH_OF_KEYTAB -f PRINCIPAL_NAME

    from pystellardb import stellar_hive

    conn = stellar_hive.StellarConnection(host="localhost", port=10000, kerberos_service_name='hive', auth='KERBEROS', graph_name='pokemon')
    cur = conn.cursor()
    cur.execute('config query.lang cypher')
    cur.execute('use graph pokemon')
    cur.execute('match p = (a)-[f]->(b) return a,f,b limit 1')

    print(cur.fetchall())

Execute Hive Query
------------------

.. code-block:: python

    from pystellardb import stellar_hive

    # If the `graph_name` parameter is None, it executes a Hive query
    # and returns data just as PyHive does
    conn = stellar_hive.StellarConnection(host="localhost", port=10000, database='default')
    cur = conn.cursor()
    cur.execute('SELECT * FROM default.abc limit 10')

Execute Graph Query and convert to a PySpark RDD object
-------------------------------------------------------

.. code-block:: python

    from pyspark import SparkContext
    from pystellardb import stellar_hive

    sc = SparkContext("local", "Demo App")

    conn = stellar_hive.StellarConnection(host="localhost", port=10000, graph_name='pokemon')
    cur = conn.cursor()
    cur.execute('config query.lang cypher')
    cur.execute('use graph pokemon')
    cur.execute('match p = (a)-[f]->(b) return a,f,b limit 10')

    rdd = cur.toRDD(sc)

    def f(x):
        print(x)

    rdd.map(lambda x: (x[0].toJSON(), x[1].toJSON(), x[2].toJSON())).foreach(f)

    # Each row of this query is a tuple of (VertexObject, EdgeObject, VertexObject)
    # Vertex and Edge objects have a toJSON() function which renders the object as JSON

Execute Hive Query and convert to a PySpark RDD object
------------------------------------------------------

.. code-block:: python

    from pyspark import SparkContext
    from pystellardb import stellar_hive

    sc = SparkContext("local", "Demo App")

    conn = stellar_hive.StellarConnection(host="localhost", port=10000)
    cur = conn.cursor()
    cur.execute('select * from default_db.default_table limit 10')

    rdd = cur.toRDD(sc)

    def f(x):
        print(x)

    rdd.foreach(f)

    # Each row of this query is a tuple of columns

Dependencies
============

Required:
---------

- Python 2.7+, less than Python 3.7

System SASL
-----------

Different systems require different packages to be installed to enable SASL support. Some examples of how to install the packages on different distributions follow.

Ubuntu:

.. code-block:: bash

    apt-get install libsasl2-dev libsasl2-2 libsasl2-modules-gssapi-mit
    apt-get install python-dev gcc    # Update Python and gcc if needed

RHEL/CentOS:

.. code-block:: bash

    yum install cyrus-sasl-md5 cyrus-sasl-plain cyrus-sasl-gssapi cyrus-sasl-devel
    yum install gcc-c++ python-devel.x86_64    # Update Python and gcc if needed

    # If your Python environment is 3.X, you may need to compile and reinstall Python
    # if `pip3 install` fails with a message like "Can't connect to HTTPS URL because the SSL module is not available"
    # 1. Download a newer version of openssl, e.g.: https://www.openssl.org/source/openssl-1.1.1k.tar.gz
    # 2. Install openssl: ./config && make && make install
    # 3. Link openssl: echo /usr/local/lib64/ > /etc/ld.so.conf.d/openssl-1.1.1.conf
    # 4. Update dynamic libs: ldconfig -v
    # 5. Download a Python source package
    # 6. vim Modules/Setup, search for '_socket socketmodule.c', uncomment:
    #        _socket socketmodule.c
    #        SSL=/usr/local/ssl
    #        _ssl _ssl.c \
    #            -DUSE_SSL -I$(SSL)/include -I$(SSL)/include/openssl \
    #            -L$(SSL)/lib -lssl -lcrypto
    # 7. Install Python: ./configure && make && make install

Windows:

.. code-block:: bash

    # There are 3 ways of installing sasl for Python on Windows
    # 1. (recommended) Download a .whl build of sasl from https://www.lfd.uci.edu/~gohlke/pythonlibs/#sasl
    # 2. (recommended) If using Anaconda, use `conda install sasl`.
    # 3. Install Microsoft Visual C++ 9.0/14.0 build tools for Python 2.7/3.x, then `pip install sasl` (under test).

Notices
=======

If you install pystellardb >= 0.9, it will install a beeline command into the system. Delete /usr/local/bin/beeline if you don't need it.

Requirements
============

Install using:

- ``pip install 'pystellardb[hive]'`` for the Hive interface.

PyHive works with:

- For Hive: the `HiveServer2 <https://cwiki.apache.org/confluence/display/Hive/Setting+up+HiveServer2>`_ daemon

Windows Kerberos Configuration
==============================

If you're connecting to databases using Kerberos authentication from the Windows platform, you'll need to install and configure Kerberos for Windows first. Get it from http://web.mit.edu/kerberos/dist/

After installation, configure the environment variables. Make sure your Kerberos variable is set ahead of the JDK variable (if you have a JDK), because the JDK also ships kinit etc.

Find /etc/krb5.conf on your KDC and copy it to krb5.ini on Windows with some modifications. For example (krb5.conf on the KDC):

.. code-block:: bash

    [logging]
     default = FILE:/var/log/krb5libs.log
     kdc = FILE:/var/log/krb5kdc.log
     admin_server = FILE:/var/log/kadmind.log

    [libdefaults]
     default_realm = DEFAULT
     dns_lookup_realm = false
     dns_lookup_kdc = false
     ticket_lifetime = 24h
     renew_lifetime = 7d
     forwardable = true
     allow_weak_crypto = true
     udp_preference_limit = 32700
     default_ccache_name = FILE:/tmp/krb5cc_%{uid}

    [realms]
     DEFAULT = {
      kdc = host1:1088
      kdc = host2:1088
     }

Modify it: delete the [logging] section and the default_ccache_name entry in [libdefaults]:

.. code-block:: bash

    [libdefaults]
     default_realm = DEFAULT
     dns_lookup_realm = false
     dns_lookup_kdc = false
     ticket_lifetime = 24h
     renew_lifetime = 7d
     forwardable = true
     allow_weak_crypto = true
     udp_preference_limit = 32700

    [realms]
     DEFAULT = {
      kdc = host1:1088
      kdc = host2:1088
     }

This is your krb5.ini for Windows Kerberos. Put it in these 3 places:

- C:\ProgramData\MIT\Kerberos5\krb5.ini
- C:\Program Files\MIT\Kerberos\krb5.ini
- C:\Windows\krb5.ini

Finally, configure hosts at C:/Windows/System32/drivers/etc/hosts and add IP mappings for host1 and host2 from the previous example, e.g.:

.. code-block:: bash

    10.6.6.96 host1
    10.6.6.97 host2

Now you can run kinit in the command line!

Testing
=======

On his way
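The krb5.conf-to-krb5.ini edits described in the Windows Kerberos Configuration section (dropping the ``[logging]`` section and the ``default_ccache_name`` line) can also be scripted. A minimal sketch with a hypothetical helper name, using plain line-based processing rather than an INI parser (krb5.conf's nested ``[realms]`` braces are not valid INI syntax):

.. code-block:: python

    def to_windows_krb5_ini(krb5_conf_text):
        """Produce krb5.ini content from krb5.conf text: drop the
        [logging] section and the default_ccache_name entry, and keep
        everything else verbatim."""
        out = []
        section = None
        for line in krb5_conf_text.splitlines():
            stripped = line.strip()
            if stripped.startswith('[') and stripped.endswith(']'):
                section = stripped[1:-1]
            if section == 'logging':
                continue  # skip the whole [logging] section
            if stripped.startswith('default_ccache_name'):
                continue  # Windows Kerberos manages its own credential cache
            out.append(line)
        return '\n'.join(out)

This keeps the ``[realms]`` block byte-for-byte identical, which matters because the KDC host:port entries must match the server side exactly.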


Requirements
============

- future
- python-dateutil
- pyhive
- sasl
- thrift
- thrift-sasl >=0.3.0
- pyspark >=2.4.0
- sasl (>=0.2.1)
- thrift (>=0.10.0)
- requests-kerberos (>=0.12.0)
- requests (>=1.0.0)
- sqlalchemy (>=1.3.0)
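The minimum-version constraints above can be checked with a small comparator. A simplified sketch (the helper name is hypothetical; it handles only numeric dot-separated versions, with no pre-release or epoch handling like a real packaging library provides):

.. code-block:: python

    def meets_minimum(installed, minimum):
        """Return True if `installed` satisfies `>=minimum`, comparing
        dot-separated numeric components as integer tuples."""
        def parse(version):
            return tuple(int(part) for part in version.split('.'))
        return parse(installed) >= parse(minimum)

For example, ``meets_minimum('2.4.3', '2.4.0')`` holds for the pyspark constraint, while ``meets_minimum('0.2.9', '0.3.0')`` does not for thrift-sasl.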


Required Language
=================

- Python >=2.7, <=3.7
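A quick way to verify an interpreter satisfies this version constraint before installing; the function name is a hypothetical sketch, not part of PyStellarDB:

.. code-block:: python

    import sys

    def is_supported(version_info):
        """Return True if the (major, minor) interpreter version falls
        inside the supported range >=2.7, <=3.7."""
        v = tuple(version_info[:2])
        return (2, 7) <= v <= (3, 7)

    # Example: check the running interpreter
    # is_supported(sys.version_info)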


Installation
============

Install the PyStellarDB-0.11.0 ``whl`` package::

    pip install PyStellarDB-0.11.0.whl

Install the PyStellarDB-0.11.0 ``tar.gz`` package::

    pip install PyStellarDB-0.11.0.tar.gz