معرفی شرکت ها

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

توضیحات

Library for data cleaning operations

ویژگی	مقدار
سیستم عامل	-
نام فایل	cleanmydata-1.0.6
نام	cleanmydata
نسخه کتابخانه	1.0.6
نگهدارنده	[]
ایمیل نگهدارنده	[]
نویسنده	Pranav Bapat
ایمیل نویسنده	pranav.g33k@gmail.com
آدرس صفحه اصلی	-
آدرس اینترنتی	https://pypi.org/project/cleanmydata/
مجوز	MIT

## cleanmydata This library contains all the essential functions for data cleaning. It takes a list of data cleaning parameters and either a string or pandas dataframe as input Functions: 1) Remove new lines 2) Remove emails 3) Remove URLs 4) Remove hashtags (#hashtag) 5) Remove the string if it contains only numbers 6) Remove mentions (@user) 7) Remove retweets (RT...) 8) Remove text between the square brackets [ ] 9) Remove multiple whitespaces and replace with one whitespace 10) Replace characters with more than two occurrences and replace with one occurrence 11) Remove emojis 12) Count characters (only for dataframe; creates a new column) 13) Count words (only for dataframe; creates a new column) 14) Calculate average word length (only for dataframe; creates a new column) 15) Count stopwords (only for dataframe; creates two new columns, stowords and stopword_count) 16) Detect language (uses <a href="https://pypi.org/project/fasttext-langdetect/">fasttext-langdetect</a>) (only for dataframe; creates two new columns, lang and lang_prob) 17) Detect language (uses <a href="https://pypi.org/project/fasttext-langdetect/">fasttext-langdetect</a>) (only for dataframe; creates just one column with langauge and probability; takes less time) 18) ## How to install? <code>pip install cleanmydata</code> ## Parameters <ol> <li>lst (list) - List of data cleaning operations</li> <li>data (string or dataframe) - Data to be passed</li> <li>column (string) - Dataframe column on which operation to perform; only for dataframe</li> <li>save (boolean) - If you want to save the results in a new file</li> <li>name (string) - Name of the new file if save is True</li> </ol> ## Usage 1) Import the library <code>from cleanmydata.functions import clean_data</code> 2) Call the method clean_data, and pass the parameters as you wish. 3) By default, if the dataframe is passed, it drops all NA values (dropna) ## Examples 1) To remove emails and hashtags <code>mydata = "Hello folks. abc@example.com #hashtag"</code> <code>mydata = clean_data(lst=[2, 4], data=mydata)</code> <code>print(mydata)</code> 2) To count stopwords, remove mentions, and URLs, and save file from a dataframe <code>df = pd.read_csv('data/my_csv.csv', encoding='ISO-8859-1', dtype='unicode')</code> <code>df = clean_data(lst=[15, 6, 2], data=df, column='comments', save=True, name='my custome file name')</code> ## Other notes If using stopwords, make sure you have en_core_web_sm installed. <code>python -m spacy download en_core_web_sm</code> ### More options and enhancements coming soon...

نیازمندی

مقدار	نام
-	pandas
-	numpy
-	spacy
-	contextualSpellCheck
-	fasttext-langdetect

نحوه نصب

نصب پکیج whl cleanmydata-1.0.6:

pip install cleanmydata-1.0.6.whl

نصب پکیج tar.gz cleanmydata-1.0.6:

pip install cleanmydata-1.0.6.tar.gz