?????????Python???????????
??????????????????????????????????Python???????????????????????????????????????Python??????
?????????????????????????????????????????????????????????????????
?????????????????
???????????????????????????????????????????
?????????????????????????????
????????????????????????????????
????????????????????????
Python???????????????????????????????????????????????
???????CSV???????????????????????????????????
Load the data:
import pandas as pd
df = pd.read_csv('sales_data.csv')
Handle missing values:
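df['price'] = df['price'].fillna(df['price'].median())  # fill missing prices with the median so the column stays numeric
# or drop incomplete rows instead: df.dropna(inplace=True)
Compute z-scores for the price column: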
import numpy as np
def z_score(x): return (x - np.mean(x)) / np.std(x)
df['price_z'] = z_score(df['price'])  # pass the whole column; apply() would feed one value at a time
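As a minimal sketch of the missing-value and z-score steps, the following runs them on a small made-up `price` column. The sample values and the `price_z` column name are assumptions for illustration, not the contents of `sales_data.csv`. The function is called on the whole column, because `Series.apply` would hand it one scalar at a time and the standard deviation of a single value is zero.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (assumed values, for illustration only).
df = pd.DataFrame({"price": [10.0, 12.0, np.nan, 14.0, 100.0]})

# Fill the missing price with the median so the column stays numeric.
df["price"] = df["price"].fillna(df["price"].median())


def z_score(x: pd.Series) -> pd.Series:
    """Standardize a column: subtract the mean, divide by the population std."""
    return (x - x.mean()) / x.std(ddof=0)


# Apply to the whole column at once, not element by element.
df["price_z"] = z_score(df["price"])
print(df)
```

The median is used here rather than the mean because a single extreme price (such as 100.0 in the sample) shifts the mean far more than the median.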
Remove duplicate rows:
df.drop_duplicates(inplace=True)
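It can help to count the duplicates before dropping them, so the effect of the step is visible. The sketch below assumes a toy table with hypothetical `order_id` and `price` columns; it is not the article's dataset.

```python
import pandas as pd

# Hypothetical rows; order_id and price are assumed column names.
df = pd.DataFrame({
    "order_id": [1, 2, 2, 3],
    "price": [9.5, 20.0, 20.0, 7.25],
})

# duplicated() marks every row that is identical to an earlier row.
n_dupes = int(df.duplicated().sum())

# Keep the first occurrence of each row and renumber the index.
df = df.drop_duplicates().reset_index(drop=True)
print(f"removed {n_dupes} duplicate row(s), {len(df)} remain")
```

Assigning the result back instead of using `inplace=True` makes it easier to chain further steps, though either form removes the same rows.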
Extract TF-IDF text features:
from sklearn.feature_extraction.text import TfidfVectorizer
vectorizer = TfidfVectorizer(max_features=100)
tfidf = vectorizer.fit_transform(df['??'])
Save the cleaned data:
df.to_csv('cleaned_data.csv', index=False)
????????????????Python???????????????????????????????????????
Reposted from: http://itsfk.baihongyu.com/