
Importing_Data_Python_Cheat_Sheet.pdf

filename = 'huck_finn.txt'
file = open(filename, mode='r')    # Open the file for reading
text = file.read()                 # Read a file's contents
print(file.closed)                 # Check whether file is closed
file.close()                       # Close file
print(text)

with open('huck_finn.txt', 'r') as file:
    print(file.readline())         # Read a single line
    print(file.readline())
    print(file.readline())

filename = 'mnist.txt'
data = np.loadtxt(filename,
                  delimiter=',',   # String used to separate values
                  skiprows=2,      # Skip the first 2 lines
                  usecols=[0,2],   # Read the 1st and 3rd column
                  dtype=str)       # The type of the resulting array
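The np.loadtxt options above can be tried without a real file. The following is a minimal sketch that assumes a small, made-up in-memory table (the contents of raw are hypothetical, not the actual mnist.txt) and uses dtype=int instead of dtype=str so the result is numeric.

```python
import io

import numpy as np

# Hypothetical file contents: two header lines, then comma-separated values.
raw = io.StringIO(
    "sample digits\n"
    "label,pixel1,pixel2\n"
    "5,0,255\n"
    "0,12,34\n"
)

data = np.loadtxt(
    raw,
    delimiter=',',   # string used to separate values
    skiprows=2,      # skip the first 2 lines
    usecols=[0, 2],  # read the 1st and 3rd columns
    dtype=int,       # type of the resulting array
)

print(data)
```

Because usecols selects by position, the middle column (pixel1) never reaches the resulting array.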
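Files with one data type are read with np.loadtxt; files with mixed data types are read with np.genfromtxt:

filename = 'titanic.csv'
data = np.genfromtxt(filename,
                     delimiter=',',
                     names=True,    # Look for column header
                     dtype=None)    # Infer the type of each column
data_array = np.recfromcsv(filename)   # Deprecated in NumPy 2.0; np.genfromtxt covers the same case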
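As a rough illustration of how names=True and dtype=None behave on mixed data, the sketch below reads a tiny made-up table from memory. The column names and values are hypothetical and only loosely modelled on titanic.csv; encoding='utf-8' is added so text columns come back as str.

```python
import io

import numpy as np

# Hypothetical mixed-type table: one text column, one integer, one float.
raw = io.StringIO(
    "name,age,fare\n"
    "Alice,29,7.25\n"
    "Bob,35,71.28\n"
)

data = np.genfromtxt(
    raw,
    delimiter=',',
    names=True,        # take field names from the first row
    dtype=None,        # infer a type per column
    encoding='utf-8',  # decode text columns as str
)

print(data.dtype.names)  # field names taken from the header
print(data['age'])       # access a column by field name
```

The result is a one-dimensional structured array, so columns are selected by name rather than by position.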
filename = 'winequality-red.csv'
data = pd.read_csv(filename,
                   nrows=5,           # Number of rows of file to read
                   header=None,       # Row number to use as col names
                   sep='\t',          # Delimiter to use
                   comment='#',       # Character to split comments
                   na_values=[""])    # String to recognize as NA/NaN

Here df stands for any DataFrame, such as the one returned by pd.read_csv:

df.head()                    # Return first DataFrame rows
df.tail()                    # Return last DataFrame rows
df.index                     # Describe index
df.columns                   # Describe DataFrame columns
df.info()                    # Info on DataFrame
data_array = data.values     # Convert a DataFrame to a NumPy array
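The pd.read_csv parameters listed above can be exercised on a small in-memory sample. This is a sketch under assumptions: the tab-separated rows and the 'missing' marker are invented for illustration and do not come from winequality-red.csv.

```python
import io

import pandas as pd

# Hypothetical tab-separated sample with a comment line and a missing value.
raw = (
    "# wine sample\n"
    "7.4\t0.70\tmissing\n"
    "7.8\t0.88\t2.6\n"
    "11.2\t0.28\t1.9\n"
)

df = pd.read_csv(
    io.StringIO(raw),
    nrows=2,                # read only the first 2 data rows
    header=None,            # no header row: columns are numbered 0, 1, 2
    sep='\t',               # tab-delimited
    comment='#',            # ignore everything after '#'
    na_values=['missing'],  # treat this string as NaN
)

print(df.head())
data_array = df.to_numpy()  # same data as a NumPy array
```

With header=None the columns are labelled by integer position, so individual values are reached with df.iloc or the integer column labels.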
import pickle
with open('pickled_fruit.pkl', 'rb') as file:
    pickled_data = pickle.load(file)
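To see the loading step end to end, a pickle file has to exist first. The sketch below writes a hypothetical dictionary to a temporary pickled_fruit.pkl and reads it back; the dictionary contents and the temporary location are assumptions made for the example.

```python
import os
import pickle
import tempfile

fruit = {'apples': 3, 'pears': 7}  # hypothetical data to pickle

path = os.path.join(tempfile.mkdtemp(), 'pickled_fruit.pkl')

# Write the object in binary mode ...
with open(path, 'wb') as file:
    pickle.dump(fruit, file)

# ... and load it back, also in binary mode.
with open(path, 'rb') as file:
    pickled_data = pickle.load(file)

print(pickled_data)
```

Pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.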
file = 'urbanpop.xlsx'
data = pd.ExcelFile(file)
df_sheet2 = data.parse('1960-1966',
                       skiprows=[0],
                       names=['Country', 'AAM: War(2002)'])
df_sheet1 = data.parse(0,
                       usecols=[0],    # parse_cols in older pandas; replaced by usecols
                       skiprows=[0],
                       names=['Country'])
import os
path = "/usr/tmp"
wd = os.getcwd()                       # Store the name of current directory in a string
os.listdir(wd)                         # Output contents of the directory in a list
os.chdir(path)                         # Change current working directory
os.rename("test1.txt", "test2.txt")    # Rename a file
os.remove("test1.txt")                 # Delete an existing file
os.mkdir("newdir")                     # Create a new directory

data.sheet_names                       # Sheet names of the pd.ExcelFile object data
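The os calls above change real files, so a safe way to try them is inside a throwaway directory. This sketch assumes a temporary directory created with tempfile and reuses the file names from the listing purely as placeholders.

```python
import os
import tempfile

start_dir = os.getcwd()  # remember where we started

with tempfile.TemporaryDirectory() as tmp:
    os.chdir(tmp)  # change current working directory
    try:
        with open('test1.txt', 'w') as file:
            file.write('hello')
        os.rename('test1.txt', 'test2.txt')  # rename a file
        os.mkdir('newdir')                   # create a new directory
        contents = sorted(os.listdir(os.getcwd()))
        os.remove('test2.txt')               # delete an existing file
        remaining = os.listdir('.')
    finally:
        os.chdir(start_dir)  # leave the temp dir before it is deleted

print(contents)
print(remaining)
```

Note that after os.rename the old name no longer exists, so the file is removed under its new name here.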
from sas7bdat import SAS7BDAT
with SAS7BDAT('urbanpop.sas7bdat') as file:
    df_sas = file.to_data_frame()
data = pd.read_stata('urbanpop.dta')
import h5py
filename = 'H-H1_LOSC_4_v1-815411200-4096.hdf5'
data = h5py.File(filename, 'r')
import scipy.io
filename = 'workspace.mat'
mat = scipy.io.loadmat(filename)
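The object returned by scipy.io.loadmat is a dictionary. The sketch below makes that visible by saving a hypothetical variable x to an in-memory buffer with scipy.io.savemat and loading it back; the variable name and the buffer stand in for a real workspace.mat.

```python
import io

import numpy as np
import scipy.io

# Write a hypothetical MATLAB workspace with one variable into memory.
buffer = io.BytesIO()
scipy.io.savemat(buffer, {'x': np.arange(3)})
buffer.seek(0)

mat = scipy.io.loadmat(buffer)

# Keys wrapped in double underscores are file metadata added by SciPy.
user_keys = [key for key in mat.keys() if not key.startswith('__')]
print(user_keys)
print(mat['x'].shape)  # MATLAB arrays are at least 2-D
```

One-dimensional arrays come back as two-dimensional row vectors, which is worth keeping in mind when indexing the loaded data.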
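from sqlalchemy import create_engine
engine = create_engine('sqlite:///Northwind.sqlite')    # SQLite URLs take three slashes before a relative path
table_names = engine.table_names()                      # Removed in SQLAlchemy 2.0; there, use sqlalchemy.inspect(engine).get_table_names()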
con = engine.connect()
rs = con.execute("SELECT * FROM Orders")    # SQLAlchemy 2.x requires wrapping the SQL string in sqlalchemy.text()
df = pd.DataFrame(rs.fetchall())
df.columns = rs.keys()
con.close()

with engine.connect() as con:
    rs = con.execute("SELECT OrderID FROM Orders")
    df = pd.DataFrame(rs.fetchmany(size=5))
    df.columns = rs.keys()
df = pd.read_sql_query("SELECT * FROM Orders", engine)
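For a self-contained run of the query step, the sketch below swaps the SQLAlchemy engine for Python's built-in sqlite3 module, which pd.read_sql_query also accepts. The in-memory Orders table and its rows are hypothetical and only mimic the shape of the Northwind data.

```python
import sqlite3

import pandas as pd

# Build a small in-memory database (hypothetical rows, not the real Northwind file).
con = sqlite3.connect(':memory:')
con.execute("CREATE TABLE Orders (OrderID INTEGER, CustomerID TEXT)")
con.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [(10248, 'VINET'), (10249, 'TOMSP'), (10250, 'HANAR')],
)
con.commit()

# List table names by querying SQLite's catalog table.
tables = [row[0] for row in
          con.execute("SELECT name FROM sqlite_master WHERE type='table'")]

# One call runs the query and returns a DataFrame with named columns.
df = pd.read_sql_query("SELECT * FROM Orders ORDER BY OrderID", con)
first_two = pd.read_sql_query(
    "SELECT OrderID FROM Orders ORDER BY OrderID LIMIT 2", con)
con.close()

print(tables)
print(df)
```

Compared with fetching rows manually, pd.read_sql_query sets the column names itself, so no separate df.columns assignment is needed.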
Using the context manager with
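A short sketch of what the with statement buys you when reading text files: the file is closed automatically once the block ends. The file below is a temporary stand-in with made-up lines, not the real huck_finn.txt.

```python
import os
import tempfile

# Create a small stand-in text file (hypothetical contents).
path = os.path.join(tempfile.mkdtemp(), 'huck_finn.txt')
with open(path, 'w') as file:
    file.write("line one\nline two\nline three\n")

with open(path, 'r') as file:
    first = file.readline()   # read a single line
    second = file.readline()  # the next call continues where the last stopped

print(first, end='')
print(second, end='')
print(file.closed)  # the file was closed when the with block ended
```

No explicit file.close() call is needed, even if an exception is raised inside the block.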
Most of the time, you'll use either NumPy or pandas to import your data:

import numpy as np
import pandas as pd
To access the sheet names of a pd.ExcelFile object, use its sheet_names attribute.
for key in data['meta'].keys():    # Explore the HDF5 structure
    print(key)
# Output:
#   Description
#   DescriptionURL
#   Detector
#   Duration
#   GPSstart
#   Observatory
#   Type
#   UTCstart
print(data['meta']['Description'][()])    # Retrieve the value for a key (.value in older h5py; removed in h5py 3.0)

np.info(np.ndarray.dtype)
help(pd.read_csv)
print(mat.keys())          # Print dictionary keys
for key in data.keys():    # Print dictionary keys
    print(key)
# Output:
#   meta
#   quality
#   strain
pickled_data.values()      # Return dictionary values
print(mat.items())         # Returns items in list format of (key, value) tuple pairs
!ls       # List directory contents of files and directories
%cd ..    # Change current working directory
%pwd      # Return the current working directory path

Use the engine's table_names() method to fetch a list of table names.
data_array.dtype    # Data type of array elements
data_array.shape    # Array dimensions
len(data_array)     # Length of array
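For a concrete feel for these three attributes, here is a minimal sketch on a hypothetical 2 x 3 array of floats.

```python
import numpy as np

# Hypothetical array: 2 rows, 3 columns of floats.
data_array = np.array([[1.0, 2.0, 3.0],
                       [4.0, 5.0, 6.0]])

print(data_array.dtype)  # float64
print(data_array.shape)  # (2, 3)
print(len(data_array))   # 2
```

Note that len reports the size of the first dimension only (the number of rows), not the total number of elements.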