praj2408/Prompting-for-Data-Science

Prompts-for-Data-Science

Prompts for Writing Python

I want you to act as a Python code generator and create a function that will do [task].
I want you to act as a Python script writer and write a program that will scrape [data source] data from a website.
I want you to act as a Python developer and write a module that will calculate [metric] using [dataset].

Prompts for Anomaly Detection

I want you to act as a data scientist and detect [anomalies] in the [network traffic] of [organization] using [machine learning] algorithms.
I want you to act as a security analyst and identify [intrusions] in the [system logs] of [server] using [anomaly detection] techniques.
I want you to act as a fraud analyst and detect [fraudulent transactions] in the [financial data] of [company] using [statistical analysis] methods.

Prompts for Automatic Machine Learning

I want you to act as an automatic machine learning (AutoML) bot using TPOT for me. I am working on a model that predicts […]. Please write Python code to find the best classification model with the highest AUC score on the test set.
I want you to act as an AutoML system and generate Python code to build a machine learning pipeline that optimizes [metric] on [dataset].
I want you to act as an ML engineer and create an AutoML script that tunes [hyperparameters] to achieve the best performance on [dataset].
I want you to act as a data scientist and use Auto-sklearn to automatically build a classification model that predicts [target variable] based on [features] features.

Prompts to Train Classification Model

I want you to act as a data scientist and code for me. I have a dataset of [describe dataset]. Please build a machine learning model that predicts [target variable].
I want you to act as a data scientist and train a classification model to predict [target variable] based on [features] dataset.
I want you to act as a machine learning engineer and build a classification model that can classify [label] based on [features] features.
I want you to act as a deep learning specialist and train a convolutional neural network to classify [object] using [image format] images.

Prompts to Compare Function Speed

I want you to act as a software developer. I would like to compare the efficiency of two algorithms that perform the same task in Python. Please write code that helps me run an experiment that can be repeated 5 times. Please output the runtime and other summary statistics of the experiment. [Insert functions]
I want you to act as a performance tester and compare the speed of [function1] and [function2] when processing [input data] in [Python script].
I want you to act as a data scientist and compare the speed of different [machine learning algorithms] on [dataset] using the [timeit] module.
I want you to act as a speed optimizer and compare the speed of different [Python libraries] for [task] in [code snippet].
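
As a rough illustration of the kind of answer the first prompt in this section asks for, the sketch below repeats a timing experiment 5 times with the standard-library timeit module and reports summary statistics. The functions sum_with_loop and sum_with_builtin and the helper benchmark are hypothetical stand-ins for [Insert functions], not part of the original prompts.

```python
import statistics
import timeit


def sum_with_loop(n):
    """Hypothetical function 1: add the integers 0..n-1 with an explicit loop."""
    total = 0
    for i in range(n):
        total += i
    return total


def sum_with_builtin(n):
    """Hypothetical function 2: the same result using the built-in sum()."""
    return sum(range(n))


def benchmark(func, arg, repeats=5, calls_per_repeat=100):
    """Time func(arg) and return summary statistics in seconds per repeat."""
    # Each entry in timings is the total time for calls_per_repeat calls.
    timings = timeit.repeat(lambda: func(arg), repeat=repeats, number=calls_per_repeat)
    return {
        "min": min(timings),
        "mean": statistics.mean(timings),
        "stdev": statistics.stdev(timings),
        "max": max(timings),
    }


if __name__ == "__main__":
    for func in (sum_with_loop, sum_with_builtin):
        stats = benchmark(func, 10_000)
        print(func.__name__, {k: f"{v:.6f}" for k, v in stats.items()})
```

The minimum is often the most stable figure to compare, since the slower repeats usually reflect interference from other processes rather than the code itself.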

Prompts for Creating NumPy Array

I want you to act as a data scientist. I need to create a NumPy array. This NumPy array should have the shape of (x,y,z). Please initialize the NumPy array with random values.
I want you to act as a data scientist and create a 1D NumPy array of [length] that contains [values].
I want you to act as a Python developer and create a 2D NumPy array of shape [row, column] that represents the [matrix] in [dataset].
I want you to act as a machine learning expert and create a random 3D NumPy array of shape [batch_size, height, width] that simulates [image data].

Prompts for Clustering

I want you to act as a data scientist and cluster the [customers] in [dataset] into [n] groups based on their [purchase history].
I want you to act as a machine learning expert and develop a [clustering model] that groups the [documents] in [dataset] based on their [content].
I want you to act as a data analyst and visualize the [clusters] in [dataset] using [dimensionality reduction] techniques.

Prompts for Dimensionality Reduction

I want you to act as a data scientist and reduce the [dimensionality] of the [image data] in [dataset] using [principal component analysis] technique.
I want you to act as a data scientist and provide a step-by-step guide on how to perform [t-SNE] for my dataset.
I want you to act as a data scientist and explain the difference between [PCA] and [LDA] and how they can be used for [dimensionality reduction] in my dataset.

Prompts to Tune Hyperparameter

I want you to act as a data scientist and code for me. I have trained a [model name]. Please write the code to tune the hyperparameters.
I want you to act as a hyperparameter tuner and optimize the [hyperparameter] of a [algorithm] algorithm to achieve the highest [metric] on [dataset].
I want you to act as a machine learning expert and use Optuna to perform a Bayesian optimization of [hyperparameters] for a [model] on [dataset].
I want you to act as a data scientist and perform a random search of [hyperparameters] for a [algorithm] algorithm to achieve the best [metric] on [dataset].
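
To give a sense of what the random search prompt is asking for, here is a minimal, dependency-free sketch. The function validation_score is a hypothetical stand-in for "train the model and return a validation metric", and the two hyperparameters and their ranges are assumptions chosen only for illustration.

```python
import random


def validation_score(learning_rate, max_depth):
    """Hypothetical objective standing in for a real train-and-validate step.

    It peaks at learning_rate=0.1 and max_depth=6 so the search has something to find.
    """
    return 1.0 - (learning_rate - 0.1) ** 2 - 0.01 * (max_depth - 6) ** 2


def random_search(score_fn, n_trials=50, seed=0):
    """Sample hyperparameters at random and keep the best-scoring combination."""
    rng = random.Random(seed)  # seeded so the search is reproducible
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {
            "learning_rate": 10 ** rng.uniform(-3, 0),  # log-uniform, roughly 0.001 to 1
            "max_depth": rng.randint(2, 10),            # inclusive integer range
        }
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

In practice score_fn would wrap model training and evaluation on a held-out set, and the search space would come from the [hyperparameters] placeholder in the prompt.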

Prompts for Data Preprocessing

I want you to act as a data analyst and preprocess the [raw data] in [dataset] by removing [duplicate records] and [missing values].
I want you to act as a data engineer and preprocess the [time-series data] in [dataset] by resampling it to a [lower or higher frequency].
I want you to act as a data scientist and preprocess the [text data] in [dataset] by [tokenizing] it and removing [stop words] and [punctuation marks].

Prompts to Explore Data

I want you to act as a data scientist and code for me. I have a dataset of [describe dataset]. Please write code for data visualisation and exploration.
I want you to act as a data analyst and generate a visualization that shows the distribution of [feature] in [dataset].
I want you to act as a data scientist and generate summary statistics of [feature] in [dataset].
I want you to act as a data explorer and clean [dataset] by removing missing values, duplicates, and outliers.

Prompts to Generate Data

I want you to act as a fake data generator. I need a dataset that has x rows and y columns: [insert column names]
I want you to act as a data generator and create a synthetic dataset with [number of features] features and [number of instances] instances.
I want you to act as a data scientist and generate a time series dataset with [seasonality] seasonality and [trend] trend.
I want you to act as a data simulation expert and generate a dataset that simulates [process] with [parameters] parameters.
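
The time series prompt above could be answered along the lines of the sketch below, which builds a series from a linear trend, a sine-wave seasonal component, and Gaussian noise. The function name generate_series and all of its parameters are hypothetical choices for illustration; a real response would depend on the [seasonality] and [trend] you supply.

```python
import math
import random


def generate_series(n_points, period=12, trend=0.5, amplitude=10.0, noise_sd=1.0, seed=0):
    """Return n_points values made of a linear trend, a sine-wave season, and Gaussian noise."""
    rng = random.Random(seed)  # seeded so the data is reproducible
    series = []
    for t in range(n_points):
        trend_part = trend * t
        seasonal_part = amplitude * math.sin(2 * math.pi * t / period)
        noise_part = rng.gauss(0.0, noise_sd)
        series.append(trend_part + seasonal_part + noise_part)
    return series
```

For example, generate_series(36) yields three 12-step seasonal cycles on top of an upward trend, and setting noise_sd=0.0 leaves only the deterministic components.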

Prompts to Address Imbalanced Data

I want you to act as a coder. I have trained a machine learning model on an imbalanced dataset. The target variable is the column [Insert column name]. In Python, how do I oversample and/or undersample my data?
I want you to act as a data scientist and use SMOTE to oversample the minority class of [imbalanced dataset] for a classification task.
I want you to act as a machine learning expert and use stratified sampling to balance the distribution of [target variable] in [dataset].
I want you to act as a data engineer and apply random undersampling to address the class imbalance in [imbalanced dataset] for training a model.
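
As a simple illustration of the oversampling these prompts refer to, the sketch below duplicates randomly chosen minority-class rows until every class is as large as the biggest one. This is plain random oversampling, not SMOTE, and the function random_oversample and the list-of-dicts row format are assumptions made for the example.

```python
import random
from collections import Counter


def random_oversample(rows, label_key, seed=0):
    """Duplicate randomly chosen minority-class rows until every class matches the largest one."""
    rng = random.Random(seed)
    # Group the rows by their class label.
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    target_size = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap to the majority-class size.
        balanced.extend(rng.choices(group, k=target_size - len(group)))
    rng.shuffle(balanced)
    return balanced


# Hypothetical data: 8 legitimate rows and 2 fraudulent ones.
rows = [{"amount": i, "is_fraud": 0} for i in range(8)]
rows += [{"amount": 100 + i, "is_fraud": 1} for i in range(2)]
balanced = random_oversample(rows, "is_fraud")
print(Counter(row["is_fraud"] for row in balanced))
```

Resampling is normally applied to the training split only, so that the validation and test sets keep the original class distribution.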

Prompts for Natural Language Processing (NLP)

I want you to act as a machine learning expert and build a [text classification model] that classifies [customer feedback] in [dataset] as positive or negative.
I want you to act as a data scientist and analyze the [sentiment] of the [reviews] in [dataset] using [natural language processing] techniques.
I want you to act as a language model researcher and develop a [language model] that can generate [text data] similar to the [training data].

Prompts for Recommender Systems

I want you to act as a data scientist and develop a [content-based recommender system] that suggests [articles] based on [user interests].
I want you to act as a machine learning expert and build a [collaborative filtering model] that recommends [products] to [customers] based on their [purchase history].
I want you to act as a data analyst and evaluate the [accuracy] of the [recommendations] generated by the [recommender system] in [dataset].

Prompts to Train Time Series

I want you to act as a data scientist and code for me. I have a time series dataset [describe dataset]. Please build a machine learning model that predicts [target variable]. Please use [time range] as the training set and [time range] as the validation set.
I want you to act as a time series expert and build a recurrent neural network that predicts [target variable] based on [time series data].
I want you to act as a data scientist and train a seasonal ARIMA model to forecast [variable] in [time series data] using [forecast horizon] forecast periods.
I want you to act as a machine learning engineer and train a long short-term memory network that detects [event] in [sensor data].

Prompts for Time Series Forecasting

I want you to act as a data scientist and forecast the [sales] of [product] for the next [n months] using [time series forecasting] techniques.
I want you to act as a machine learning expert and develop a [neural network model] that predicts the [stock prices] of [company] based on [historical data].
I want you to act as a time series analyst and analyze the [trends and patterns] in the [weather data] of [city] using [time series decomposition] techniques.

Prompts to Visualize Data

I want you to act as a coder in Python. I have a dataset [name] with columns [name]. [Describe graph requirements]
I want you to act as a data visualization expert and create a [type of plot] that shows the relationship between [variable1] and [variable2] in [dataset].
I want you to act as a data scientist and create a [type of plot] that displays the distribution of [variable] in [dataset] and compare it across different [categorical variable].
I want you to act as a data analyst and create a [type of plot] that shows the trend of [variable] over time in [dataset].
I want you to act as a coder. I have a folder of images. [Describe how files are organised in directory] [Describe how you want images to be printed]

Prompts to Explain Model with LIME & SHAP

I want you to act as a data scientist and explain the model’s results. I have trained a [library name] model and I would like to explain the output using LIME. Please write the code.
I want you to act as a machine learning specialist and use LIME to explain how a [model] made a prediction for a specific instance in [dataset].
I want you to act as a data scientist and use LIME to identify the important features that contributed to the prediction of [target variable] for [model] on [dataset].
I want you to act as a model explainer and use LIME to explain how a [model] handles the interaction between [features] in [dataset].
I want you to act as a data scientist and explain the model’s results. I have trained a scikit-learn XGBoost model and I would like to explain the output using a series of plots with SHAP. Please write the code.

Prompts to Get Feature Importance

I want you to act as a data scientist and explain the model’s results. I have trained a decision tree model and I would like to find the most important features. Please write the code.
I want you to act as a data scientist and use [feature selection algorithm] to calculate the feature importance of [dataset] for [target variable].
I want you to act as a machine learning expert and train a [model] on [dataset] to identify the top [number] most important features for [target variable].
I want you to act as a data analyst and use the permutation feature importance technique to assess the importance of [features] for predicting [target variable] in [dataset].

Prompts to Validate Column

I want you to act as a data scientist. Please write code to test that my pandas DataFrame [insert requirements here]
I want you to act as a data analyst and validate the [column] in [dataset] to ensure that it contains only [valid data type].
I want you to act as a data quality analyst and validate the [column] in [dataset] to ensure that it contains only [acceptable range of values].
I want you to act as a data scientist and validate the [column] in [dataset] to ensure that it is not affected by [missing values] and [outliers].

Prompts to Write Multithreaded Functions

I want you to act as a coder. Can you help me parallelize this code across threads in Python?
I want you to act as a Python developer and write a multithreaded function that can perform [task] on [input] using [number of threads] threads.
I want you to act as a performance optimizer and write a multithreaded function that can parallelize the [bottleneck task] in [code section] of [Python script].
I want you to act as a concurrency expert and write a multithreaded function that can asynchronously process [list of tasks] with the help of a thread pool.
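
The thread pool prompt above might lead to something like the following sketch, which uses ThreadPoolExecutor from the standard library. The task fetch_record is a hypothetical I/O-bound placeholder (the sleep stands in for a network or disk wait), and process_in_threads is an illustrative helper name.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_record(record_id):
    """Hypothetical I/O-bound task; the sleep stands in for a network or disk wait."""
    time.sleep(0.05)
    return record_id * 2


def process_in_threads(task, inputs, max_workers=4):
    """Run task over inputs on a thread pool and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of inputs even if tasks finish out of order.
        return list(pool.map(task, inputs))
```

Threads tend to help most with I/O-bound work like this. For CPU-bound work in standard CPython, the global interpreter lock usually limits the gain, and a process pool may be a better fit.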

Prompts to Write Regex

I want you to act as a coder. Please write me a regex in Python that [describe regex]
I want you to act as a regex writer and write a regular expression that matches [pattern] in [text].
I want you to act as a data engineer and use regex to extract [data] from [log file].
I want you to act as a web scraper and write a regex that matches [pattern] in [HTML source].
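
For the log extraction prompt above, a response could look roughly like this sketch. The log format (date, time, level, message) is an assumption invented for the example, as are the names LOG_LINE and parse_log; a real pattern depends on the [log file] you describe.

```python
import re

# Hypothetical log format: "<date> <time> <LEVEL> <message>"
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<message>.*)$"
)


def parse_log(text):
    """Return one dict per line that matches LOG_LINE; other lines are skipped."""
    records = []
    for line in text.splitlines():
        match = LOG_LINE.match(line)
        if match:
            records.append(match.groupdict())
    return records
```

Named groups keep the extracted fields readable, which makes it easier to check that the generated regex captures what you described in the prompt.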

Prompts to Write Unit Test

I want you to act as a software developer. Please write unit tests for the function [Insert function]. The test cases are: [Insert test cases]
I want you to act as a Python developer and write a unit test for the [function] in [Python script] to verify that it returns the expected output when provided with [input].
I want you to act as a software engineer and write a unit test to ensure that the [web service] handles [error condition] correctly.
I want you to act as a test automation engineer and write a unit test to verify that the [GUI component] updates the [UI element] correctly when the [user action] is performed.
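
As an example of the output the first two prompts in this section aim for, the sketch below tests a small function with the standard-library unittest module. The function normalize and its three test cases are hypothetical stand-ins for [Insert function] and [Insert test cases].

```python
import unittest


def normalize(values):
    """Hypothetical function under test: scale numbers linearly to the range [0, 1]."""
    if not values:
        raise ValueError("values must not be empty")
    low, high = min(values), max(values)
    if low == high:
        # All values are equal, so there is no range to scale over.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


class NormalizeTests(unittest.TestCase):
    def test_scales_to_unit_range(self):
        self.assertEqual(normalize([2, 4, 6]), [0.0, 0.5, 1.0])

    def test_constant_input_maps_to_zero(self):
        self.assertEqual(normalize([3, 3]), [0.0, 0.0])

    def test_empty_input_raises(self):
        with self.assertRaises(ValueError):
            normalize([])
```

Saved in a file whose name starts with test_, these cases can be run with python -m unittest. Listing the edge cases explicitly in the prompt (empty input, constant input) makes it more likely that the generated tests cover them.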

Prompts for Writing Code

I want you to act as a data scientist using R. Can you write an R script that [Insert requirement here]
I want you to act as a data scientist and write SQL code for me. I have a table with two columns [Insert column names]. I would like to calculate a running average for [which value]. What is the SQL code that works for PostgreSQL 14?
I want you to act as a Linux terminal expert. Please write the code to [describe requirements]
Assume you are given the tables… with the columns… Output the following… [Question from Data Lemur]
I want you to act as a bot that generates Google Sheets formula. Please generate a formula that [describe requirements]
I want you to act as an Excel VBA developer. Can you write a VBA that [Insert function here]?
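
For the running average prompt above, the core of a typical answer is a window function. The sketch below runs one from Python against an in-memory SQLite database, which stands in for PostgreSQL here and needs SQLite 3.25 or later for window function support. The table name sales and the columns day and amount are hypothetical; the window clause itself is standard SQL and should carry over to PostgreSQL 14, though that is not verified here.

```python
import sqlite3

# Hypothetical table and column names; SQLite stands in for PostgreSQL here.
RUNNING_AVG_SQL = """
SELECT day,
       amount,
       AVG(amount) OVER (
           ORDER BY day
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_avg
FROM sales
ORDER BY day
"""


def running_average(rows):
    """Load (day, amount) rows into an in-memory table and return the running average."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (day TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        return conn.execute(RUNNING_AVG_SQL).fetchall()
    finally:
        conn.close()
```

Running a generated query against a few hand-checked rows like this is a quick way to confirm that it computes what the prompt asked for before using it on real data.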

Prompts for Explaining Code

I want you to act as a code explainer. What is this code doing? [paste your code]
I want you to act as a Google Sheets formula explainer. Explain the following Google Sheets command. [Insert formula]
I want you to act as a data science instructor. Can you please explain to me what this SQL code is doing? [Insert SQL code]

Prompts for Optimizing Code

I want you to act as a code optimizer. The code is poorly written. How do I correct it? [Insert code here]
I want you to act as a software developer. Please help me improve the time complexity of the code below. [Insert code]
I want you to act as a code optimizer. Can you point out what’s wrong with the following Pandas code and optimize it? [Insert code here]
I want you to act as a code simplifier. Can you simplify the following code? [Insert code here]
I want you to act as an SQL code optimizer. The following code is slow. Can you help me speed it up? [Insert SQL]

Prompts for Translating Code

I want you to act as a code translator. Can you please convert the following code from [python] to [R]? [Insert code]
I want you to act as a coder and write SQL code for MySQL. What is the equivalent of PostgreSQL’s DATE_TRUNC for MySQL?

Prompt to Write Documentation

I want you to act as a software developer. Please provide documentation for the function below. [Insert function]

Prompt to Improve Readability

I want you to act as a code analyzer. Can you improve the following code for readability and maintainability? [Insert code]

Prompts to Format SQL

I want you to act as a SQL formatter. Please format the following SQL code. Please convert all reserved keywords to uppercase [Insert requirements]. [Insert Code]

Prompts to Explain Concepts

I want you to act as a data science instructor. Explain [concept] to a five-year-old.
I want you to act as a data science instructor. Explain [concept] to an undergraduate.
I want you to act as a data science instructor. Explain [concept] to a professor.
I want you to act as a data science instructor. Explain [concept] to a business stakeholder.
I want you to act as an answerer on StackOverflow. You can provide code snippets, sample tables and outputs to support your answer. [Insert technical question]

Prompts for Suggesting Ideas

Suggest A/B Testing Steps

I want you to act as a statistician. [Describe context] Please design an A/B test for this purpose. Please include the concrete steps, including which statistical test I should run.

Suggest Dataset

I want you to act as a data science career coach. I want to build a predictive model for […]. At the same time, I would like to showcase my knowledge in […]. Can you please suggest the five most relevant datasets for my use case?

Suggest Edge Cases

I want you to act as a software developer. Please help me catch edge cases for this function [insert function]

Suggest Feature Engineering

I want you to act as a data scientist and perform feature engineering. I am working on a model that predicts [insert feature name]. There are columns: [Describe columns]. Can you suggest features that we can engineer for this machine learning problem?

Suggest Portfolio Ideas

I want you to act as a data science coach. My background is in […] and I would like to [career goal]. I need to build a portfolio of data science projects that will help me land a role in […] as a […]. Can you suggest five specific portfolio projects that will showcase my expertise in […] and are of relevance to [company]?

Suggest Resources

I want you to act as a data science coach. I would like to learn about [topic]. Please suggest the 3 best specific resources. You can include [specify resource type]

Suggest Time Complexity

I want you to act as a software developer. Please compare the time complexity of the two algorithms below. [Insert two functions]

Career Coaching

I want you to act as a career advisor. I am looking for a role as a [role name]. My background is […]. How do I land the role within 6 months, and exactly which resources should I use?

Prompts for Troubleshooting Problems

Correct Python Code

I want you to act as a software developer. This code is supposed to [expected function]. Please help me debug this Python code that cannot be run. [Insert function]

Correct ChatGPT’s Own Code

The code above is wrong. [Point out what is wrong]. Can you try again?

Correct SQL Code

I want you to act as a SQL code corrector. This code does not run in [your DBMS, e.g. PostgreSQL]. Can you correct it for me? [SQL code here]

Troubleshoot Power BI Model

I want you to act as a Power BI modeler. Here are the details of my current project. [Insert details]. Do you see any problems with the table?