import os
cwd = os.getcwd()
cwd
In this tutorial, I explain only what you need to become a data scientist, neither more nor less.
A data scientist needs the following skills:
1. Basic tools: Python, R or SQL. You do not need to know everything; for this tutorial you only need to learn how to use Python.
2. Basic statistics: mean, median or standard deviation. If you know basic statistics, you can apply them easily in Python.
3. Data munging: working with messy and difficult data, such as inconsistent date and string formatting. As you might guess, Python helps here too.
4. Data visualization: the title is self-explanatory. We will visualize data with Python libraries such as matplotlib and seaborn.
5. Machine learning: you do not need to understand the math behind every machine learning technique. You only need to understand the basics of machine learning and learn how to implement it in Python.
In this part, you will learn:
•User defined function
•Scope
•Nested function
•Default and flexible arguments
•Lambda function
•Anonymous function
•Iterators
•List comprehension
What we need to know about functions:
•docstrings: documentation for functions.
Example: for f(): """This is the docstring that documents function f"""
•tuple: an immutable sequence of Python objects. Its values cannot be modified. A tuple is written with parentheses, like t = (1,2,3), and can be unpacked into several variables, like a,b,c = t.
# example of what we learned above
def tuple_ex():
    """Return the defined tuple t."""
    t = (1,2,3)
    return t
a,b,c = tuple_ex()
print(a,b,c)
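As a small sketch of the two ideas above (the function name min_max is made up for illustration): a docstring is stored on the function and can be read back through __doc__, and a tuple raises a TypeError if you try to change one of its elements.

```python
def min_max(values):
    """Return the smallest and largest item of values as a tuple."""
    return (min(values), max(values))

# the docstring is stored on the function object
print(min_max.__doc__)

# unpack the returned tuple into two variables
lowest, highest = min_max([4, 1, 9])
print(lowest, highest)

# tuples are immutable: assigning to an element raises TypeError
t = (1, 2, 3)
try:
    t[0] = 10
except TypeError as err:
    error_message = str(err)
    print("cannot modify tuple:", error_message)
```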
What we need to know about scope:
•global: defined in the main body of the script
•local: defined inside a function
•built-in scope: names in the predefined built-in module, such as print and len
Let's make some basic examples.
# guess what this prints
x = 2
def f():
    x = 3
    return x
print(x)      # x = 2, global scope
print(f())    # x = 3, local scope
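One more sketch on scope, with hypothetical names counter and increment. Assigning to a name inside a function normally creates a new local name, so if the function should change the global one, it has to say so with the global statement. The last line just checks that len really lives in the built-in scope.

```python
import builtins

counter = 0  # global scope

def increment():
    global counter   # rebind the global name instead of creating a local one
    counter += 1
    return counter

increment()
increment()
print(counter)

# names such as print and len live in the built-in scope
print(hasattr(builtins, "len"))
```

Using global a lot makes code harder to follow, so returning a value from the function is usually the cleaner choice.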
•Nested function: a function defined inside another function.
•The LEGB rule gives the order in which Python searches for a name: local scope, enclosing function, global scope and built-in scope, respectively.
# nested function
def square():
    """Return the square of the value computed by add()."""
    def add():
        """Add two local variables."""
        x = 2
        y = 3
        z = x + y
        return z
    return add()**2
print(square())
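The LEGB search order can be seen in a small sketch like the one below. The names outer, inner and read_global are made up for this illustration.

```python
x = "global x"

def outer():
    x = "enclosing x"

    def inner():
        # no local x here, so Python looks in the enclosing function next
        return x

    return inner()

def read_global():
    # no local or enclosing x, so the global one is found
    return x

print(outer())
print(read_global())
# len is not defined in this script; it is found last, in the built-in scope
print(len("abc"))
```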
•Default argument example:
def f(a, b=1):
    """ b = 1 is the default argument"""
•Flexible argument example:
def f(*args):
    """ *args can be zero or more positional arguments"""
def f(**kwargs):
    """ **kwargs is a dictionary of keyword arguments"""
Let's write some code to practice.
# default arguments
def f(a, b = 1, c = 2):
    y = a + b + c
    return y
print(f(5))
# what if we want to change the default arguments?
print(f(5,4,3))
# flexible arguments *args
def f(*args):
    for i in args:
        print(i)
f(1)
print("")
f(1,2,3,4)
# flexible arguments **kwargs, which is a dictionary
def f(**kwargs):
    """ print key and value of dictionary"""
    # if this part is unclear, go back to the for loop part and look at looping over a dictionary
    for key, value in kwargs.items():
        print(key, " ", value)
f(country = 'spain', capital = 'madrid', population = 123456)
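All of these argument kinds can appear in one signature. Here is a minimal sketch with a hypothetical function describe that simply returns what each parameter received, so you can see where every value ends up.

```python
def describe(name, greeting="hello", *args, **kwargs):
    """Return what each kind of parameter received."""
    return name, greeting, args, kwargs

# only the required argument: the default is used, args and kwargs stay empty
only_name = describe("ash")
print(only_name)

# extra positional values go to args, extra keyword values go to kwargs
everything = describe("ash", "hi", 1, 2, region="kanto")
print(everything)

# * and ** also work at the call site to unpack a list or a dictionary
unpacked = describe(*["ash", "hi"], **{"region": "kanto"})
print(unpacked)
```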
A lambda function is a faster way of writing a short function.
# lambda function
square = lambda x: x**2 # where x is name of argument
print(square(4))
tot = lambda x,y,z: x+y+z # where x,y,z are names of arguments
print(tot(1,2,3))
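A lambda builds the same kind of function object as def, but its body is limited to a single expression. The sketch below (names square_def, square_lambda and pairs are made up) compares the two forms and shows a typical use as a short sort key.

```python
def square_def(x):
    return x ** 2

square_lambda = lambda x: x ** 2

# both forms give the same result
print(square_def(4), square_lambda(4))

# lambdas are handy as short throwaway arguments, for example a sort key
pairs = [("b", 3), ("a", 1), ("c", 2)]
by_number = sorted(pairs, key=lambda pair: pair[1])
print(by_number)
```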
An anonymous function is a lambda function that is not assigned to a name; it is usually passed directly to another function.
•map(func,seq): applies a function to every item of a sequence such as a list
number_list = [1,2,3]
y = map(lambda x:x**2,number_list)
print(list(y))
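Two details of map() that are easy to miss, as a small sketch: the object it returns can be consumed only once, and when you pass more than one sequence the function receives one item from each.

```python
number_list = [1, 2, 3]

squares = map(lambda x: x ** 2, number_list)
first_pass = list(squares)    # consumes the map object
second_pass = list(squares)   # nothing is left the second time
print(first_pass, second_pass)

# with two sequences, the function gets one item from each
sums = list(map(lambda a, b: a + b, [1, 2, 3], [10, 20, 30]))
print(sums)
```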
•iterable: an object that can return an iterator, i.e. an object with an associated __iter__() method, which iter() calls
example: lists, strings and dictionaries
•iterator: an object that produces the next value each time next() is called on it
# iteration example
name = "ronaldo"
it = iter(name)
print(next(it)) # print next iteration
print(*it) # print remaining iteration
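A for loop does roughly the same thing behind the scenes: it calls iter() once and then next() until the iterator raises StopIteration. A rough sketch of that, written by hand:

```python
letters = iter("abc")
collected = []

while True:
    try:
        item = next(letters)
    except StopIteration:
        # the iterator has no more values
        break
    collected.append(item)

print(collected)

# next() accepts a default that is returned instead of raising StopIteration
leftover = next(letters, "done")
print(leftover)
```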
zip(): pairs up the items of two or more lists
# zip example
list1 = [1,2,3,4]
list2 = [5,6,7,8]
z = zip(list1,list2)
print(z) # prints a zip object, not the pairs themselves
z_list = list(z)
print(z_list)
un_zip = zip(*z_list)
un_list1,un_list2 = list(un_zip) # unzipping returns tuples
print(un_list1)
print(un_list2)
print(type(un_list2))
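One thing to watch out for, shown here with made-up lists: zip() stops at the shortest input, so extra items in the longer list are silently dropped. The pairs can also be turned straight into a dictionary.

```python
letters = ["a", "b", "c"]
numbers = [1, 2]   # one item shorter than letters

# "c" has no partner, so it does not appear in the result
pairs = list(zip(letters, numbers))
print(pairs)

# the pairs can be used directly as key/value pairs
lookup = dict(zip(letters, numbers))
print(lookup)
```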
List comprehension is one of the most important topics of this kernel.
We often use list comprehension for data analysis. A list comprehension collapses a for loop that builds a list into a single line. Example: num1 = [1,2,3] and we want to make num2 = [2,3,4]. This can be done with a for loop, but that is unnecessarily long. With a list comprehension it becomes one line of code.
# Example of list comprehension
num1 = [1,2,3]
num2 = [i + 1 for i in num1 ]
print(num2)
[i + 1 for i in num1 ]: the list comprehension
i + 1: the expression evaluated for each item
for i in num1: for loop syntax
i: loop variable
num1: iterable object
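For comparison, here is the for loop version next to the list comprehension. The names num2_loop and num2_comp are only used here to keep the two results apart.

```python
num1 = [1, 2, 3]

# for loop version: create an empty list and append to it
num2_loop = []
for i in num1:
    num2_loop.append(i + 1)

# list comprehension version: the same result in one line
num2_comp = [i + 1 for i in num1]

print(num2_loop, num2_comp)
```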
# Conditionals on iterable
num1 = [5,10,15]
num2 = [i**2 if i == 10 else i-5 if i < 7 else i+5 for i in num1]
print(num2)
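Chained conditionals inside a comprehension get hard to read quickly. The same logic can be sketched as an ordinary function (transform is a hypothetical name) with if / elif / else, which makes the order of the checks explicit.

```python
def transform(i):
    """Apply the same rules as the chained conditional expression."""
    if i == 10:
        return i ** 2
    elif i < 7:
        return i - 5
    else:
        return i + 5

num1 = [5, 10, 15]
num2 = [transform(i) for i in num1]
print(num2)
```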
Data and packages:
# Packages: matplotlib, seaborn, numpy and pandas (for manipulating the dataframe data structure)
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import seaborn as sns # visualization tool
data = pd.read_csv('pokemon.csv')
data.head(5)
# let's return to the pokemon csv and make one more list comprehension example
# let's classify pokemons by whether they have high or low speed. Our threshold is the average speed.
threshold = sum(data.Speed)/len(data.Speed)
data["speed_level"] = ["high" if i > threshold else "low" for i in data.Speed]
data.loc[:10,["speed_level","Speed"]] # we will learn loc in more detail later
Cheers!
/itsmecevi