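Contents
Cursor
Introduction
Download
Usage tips
CHAT
example 1
Note
example 2
GitHub Copilot
Official site
Introduction
Install as a plugin
PyCharm
Automatic code generation
example 1: write a class that fetches data from MySQL
example 2: write a class that detects multicollinearity
Summary

Cursor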
Introduction
Cursor is an editor made for programming with AI. It’s early days, but right now Cursor can help you with a few things…
Write: Generate 10-100 lines of code with an AI that’s smarter than Copilot
Diff: Ask the AI to edit a block of code, see only proposed changes
Chat: ChatGPT-style interface that understands your current file
And more: ask to fix lint errors, generate tests/comments on hover, etc
Download:
https://www.cursor.so/
Usage tips:
https://twitter.com/amanrsanger
CHAT:
example 1:
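Note:
The code in the last image above runs in an IDE without raising any error, but one line,
vif["VIF"] = [variance_inflation_factor(df.values, i) for i in range(df.shape[1]-1)]
does not follow the mathematics of multicollinearity analysis, that is, of the VIF. VIF measures linear relationships among the explanatory variables only. It is linear because the function calls OLS directly; replacing the objective inside OLS with a nonlinear equation would capture nonlinear relationships instead. The line above passes all of df.values to variance_inflation_factor, so the matrix contains the dependent variable as well as the explanatory variables, which contradicts the principle of multicollinearity analysis.
It should therefore be changed to: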
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

data = {'x1': [1, 2, 3, 4, 5],
        'x2': [2, 4, 6, 8, 10],
        'x3': [3, 6, 9, 12, 15],
        'y': [2, 4, 6, 8, 10]}
df = pd.DataFrame(data)

# Get the VIF for each feature; the last column, y, is the dependent variable
vif = pd.DataFrame()
vif["feature"] = df.columns[:-1]
# Wrong: df.values still contains the y column
# vif["VIF"] = [variance_inflation_factor(df.values, i) for i in range(df.shape[1]-1)]
# Right: drop the y column before computing the VIF
vif["VIF"] = [variance_inflation_factor(df.values[:, :-1], i) for i in range(df.shape[1]-1)]

# Print the results (x2 and x3 are exact multiples of x1 in this toy data,
# so the reported VIFs are infinite or extremely large)
print(vif)
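To make the difference concrete, here is a minimal NumPy-only sketch. It is not the statsmodels implementation: the helper `vif_with_intercept` and the synthetic data are assumptions introduced for illustration, and the helper adds an intercept to each auxiliary regression. Two independent predictors should have VIFs close to 1, but once y = x1 + x2 + noise is left in the matrix, each predictor is almost perfectly explained by the remaining columns and its VIF becomes very large.

```python
import numpy as np

def vif_with_intercept(X, i):
    """Hypothetical helper: VIF of column i, obtained by regressing it on
    the other columns plus an intercept (ordinary least squares)."""
    X = np.asarray(X, dtype=float)
    target = X[:, i]
    others = np.delete(X, i, axis=1)
    design = np.column_stack([np.ones(len(X)), others])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    r_squared = 1.0 - resid.var() / target.var()
    return 1.0 / (1.0 - r_squared)

# Synthetic data (an assumption for illustration, not the data from the post)
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
y = x1 + x2 + 0.1 * rng.normal(size=200)

X_only = np.column_stack([x1, x2])      # explanatory variables only
X_and_y = np.column_stack([x1, x2, y])  # mistake: y left in the matrix

vif_correct = [vif_with_intercept(X_only, i) for i in range(2)]
vif_wrong = [vif_with_intercept(X_and_y, i) for i in range(2)]
print("without y:", vif_correct)
print("with y:   ", vif_wrong)
```

Neither call raises an error, which is why the mistake is easy to miss; only the size of the numbers gives it away.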
Explanation of the principle:
def variance_inflation_factor(exog, exog_idx):
    """
    Variance inflation factor, VIF, for one exogenous variable

    The variance inflation factor is a measure for the increase of the
    variance of the parameter estimates if an additional variable, given by
    exog_idx is added to the linear regression. It is a measure for
    multicollinearity of the design matrix, exog.

    One recommendation is that if VIF is greater than 5, then the explanatory
    variable given by exog_idx is highly collinear with the other explanatory
    variables, and the parameter estimates will have large standard errors
    because of this.

    Parameters
    ----------
    exog : {ndarray, DataFrame}
        design matrix with all explanatory variables, as for example used in
        regression
    exog_idx : int
        index of the exogenous variable in the columns of exog

    Returns
    -------
    float
        variance inflation factor

    Notes
    -----
    This function does not save the auxiliary regression.

    See Also
    --------
    xxx : class for regression diagnostics  TODO: does not exist yet

    References
    ----------
    https://en.wikipedia.org/wiki/Variance_inflation_factor
    """
    k_vars = exog.shape[1]
    exog = np.asarray(exog)
    x_i = exog[:, exog_idx]
    mask = np.arange(k_vars) != exog_idx
    x_noti = exog[:, mask]
    r_squared_i = OLS(x_i, x_noti).fit().rsquared
    vif = 1. / (1. - r_squared_i)
    return vif
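The last lines of the function are the standard definition of the VIF. Written out, with R_i^2 denoting the coefficient of determination obtained by regressing column i of exog on all the other columns of exog:

```latex
\mathrm{VIF}_i = \frac{1}{1 - R_i^2}
```

For instance, R_i^2 = 0.8 gives VIF_i = 1 / (1 - 0.8) = 5, the threshold mentioned in the docstring, and VIF_i grows without bound as R_i^2 approaches 1. The auxiliary regression uses exactly the columns passed in as exog, so if y is among them, R_i^2 also measures how well x_i is explained by y, which is no longer a statement about collinearity among the explanatory variables. The function also adds no intercept of its own; whether to include a constant column in exog is left to the caller.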
example 2:
GPT-4 is too large for it to write, so what it produced is sample code that calls GPT-2.
GitHub Copilot
Official site
https://github.com/features/copilot
Introduction
GitHub Copilot uses the OpenAI Codex to suggest code and entire functions in real-time, right from your editor. Trained on billions of lines of code, GitHub Copilot turns natural language prompts into coding suggestions across dozens of languages.
Don’t fly solo
Developers all over the world use GitHub Copilot to code faster, focus on business logic over boilerplate, and do what matters most: building great software.
Focus on solving bigger problems
Spend less time creating boilerplate and repetitive code patterns, and more time on what matters: building great software. Write a comment describing the logic you want and GitHub Copilot will immediately suggest code to implement the solution.
Get AI-based suggestions, just for you
GitHub Copilot shares recommendations based on the project’s context and style conventions. Quickly cycle through lines of code, complete function suggestions, and decide which to accept, reject, or edit.
Code confidently in unfamiliar territory
Whether you’re working in a new language or framework, or just learning to code, GitHub Copilot can help you find your way. Tackle a bug, or learn how to use a new framework without spending most of your time spelunking through the docs or searching the web.
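There are currently three plans: individual, business, and education. In the education plan, student accounts get more free features than teacher accounts. Verifying either a student or a teacher account only requires a school email address and a campus card, which is quite convenient.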
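Install as a plugin
PyCharm
Automatic code generation
All of the code below, except the first comment line, was auto-completed by Copilot.
example 1: write a class that fetches data from MySQL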
# Write a class that fetches data from MySQL
import pymysql
class Mysql:
    def __init__(self,host,user,password,database,port=3306):
        self.host = host
        self.user = user
        self.password = password
        self.database = database
        self.port = port
        self.conn = pymysql.connect(host=self.host,user=self.user,password=self.password,database=self.database,port=self.port)
        self.cursor = self.conn.cursor()
    def fetchone(self,sql):
        self.cursor.execute(sql)
        return self.cursor.fetchone()
    def fetchall(self,sql):
        self.cursor.execute(sql)
        return self.cursor.fetchall()
    def close(self):
        self.cursor.close()
        self.conn.close()
    def commit(self):
        self.conn.commit()
    def rollback(self):
        self.conn.rollback()
    def execute(self,sql):
        self.cursor.execute(sql)
        self.commit()
    def executemany(self,sql,values):
        self.cursor.executemany(sql,values)
        self.commit()
example 2: write a class that detects multicollinearity
import pandas as pd  # needed for pd.DataFrame below; missing from the completion
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
class Multicollinearity:
    def __init__(self,df):
        # df should contain only the explanatory variables (no target column)
        self.df = df
    def get_vif(self):
        vif = pd.DataFrame()
        vif["VIF Factor"] = [self.get_vif_score(i) for i in range(self.df.shape[1])]
        vif["features"] = self.df.columns
        return vif
    def get_vif_score(self,i):
        # Regress column i on all the other columns and convert R^2 to a VIF
        y = self.df.iloc[:,i]
        x = self.df.drop(self.df.columns[i],axis=1)
        lr = LinearRegression()
        lr.fit(x,y)
        r2 = r2_score(y,lr.predict(x))
        return 1/(1-r2)
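Summary
When you let a tool write code for you, check the result against a real example or real data. Even if nothing raises an error, for numerical code it is best to step through it in a debugger and confirm that the generated code matches your description and the underlying mathematics. The Note above shows a concrete case.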
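One way to carry out such a check for the VIF case is to compare the generated code with a value that is known in closed form. The sketch below is only an illustration under stated assumptions: `vif_candidate` is a hypothetical stand-in for whatever function the tool produced, and the data are synthetic. With exactly two predictors and an intercept, the auxiliary R^2 equals the squared sample correlation r^2, so each VIF must equal 1 / (1 - r^2).

```python
import numpy as np

def vif_candidate(X, i):
    """Hypothetical stand-in for the generated code under test."""
    X = np.asarray(X, dtype=float)
    target = X[:, i]
    design = np.column_stack([np.ones(len(X)), np.delete(X, i, axis=1)])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    r_squared = 1.0 - resid.var() / target.var()
    return 1.0 / (1.0 - r_squared)

# Synthetic, moderately correlated predictors (an assumption for illustration)
rng = np.random.default_rng(42)
x1 = rng.normal(size=500)
x2 = 0.6 * x1 + rng.normal(size=500)
X = np.column_stack([x1, x2])

# Closed-form reference value for the two-predictor case
r = np.corrcoef(x1, x2)[0, 1]
expected = 1.0 / (1.0 - r ** 2)

computed = [vif_candidate(X, i) for i in range(2)]
print("expected:", expected)
print("computed:", computed)
assert np.allclose(computed, expected)
```

If the assertion fails, the generated function is not computing the quantity its name promises, even though it runs without error.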