Week 3: Python 1 (Teaching notes)

Time (start)  Duration  Topic                       Additional materials
00:00         5 min     Group formation
00:05         30 min    Basic concepts
00:35         25 min    Part 1: Setup               notebook 1
01:00         30 min    Part 2: Data items
01:30         15 min    Break
01:45         30 min    Part 3: External libraries
02:15         70 min    Part 4: Functions
03:25         10 min    Wrap-up
03:35         215 min   Overall
  • Become familiar with Python syntax (assuming you have taken programming courses before)
  • Learn good debugging and development practices
  • Understand how to extend a Python package (CoLRev)
  • Have students start the Codespaces on GitHub from the colrev/tutorial branch (see notebook)
  • It is important that students work on Codespaces (not their own machines) to avoid technical setup issues
  • Students can ignore the warning displayed when committing (they cannot push due to a lack of permissions)
  • It can be helpful to make mistakes on purpose (e.g., wrong indentation) to read and interpret the error messages with students.

Group formation

  • Who forked/leads
  • Facilitate group formation, highlight cases where groups are overbooked (ask students to switch)


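Before switching to the tutorial/after the tutorial: run

pip3 install -e .

to install the package from the right repository in editable mode (the -e flag), so that the code in the current directory is the code that runs.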

Python

  • Object-oriented
  • Procedural
  • Functional

Interpreted language: unlike Java, which requires us to compile the code (e.g., into a jar file) before running it

  • Strongly typed: explicit conversion required
  • Python fails at runtime when asked to multiply/divide two strings. If you need the number stored inside a string variable, you need to convert it explicitly (int_var = int("99")).
  • Dynamically typed: type information is only evaluated when the code runs (e.g., string * string inside an if-branch that is never executed does not fail)

Example:

Java: 
int count = 2;

Python:
count = 2
# type(count) is <class 'int'> (everything is an object)
word = "test"
count * word  # "testtest" (int * str repeats the string)
word + count  # raises TypeError (str + int: no implicit conversion to str)
word + str(count) + "test2"  # "test2test2"
# think: count = "2"
if False:
    count + word  # no error: never executed (dynamically typed/only evaluated at runtime)
if True:
    count + word  # TypeError (int + str)

Highlight:

  • Our focus: using the programming language to build things (not to understand the programming language)
  • Use Google/Stack Overflow on any error/challenge that comes up!
  • Use code quality checkers and tests

Note: we work in a single directory. In session 2, we will distinguish the code and data directories.

Setup

  • Explain __main__

  • commit: pre-commit hooks!
  • explain later (they do some formatting and warn us if there are code quality issues)
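
For the __main__ explanation, a minimal sketch may help. It assumes a script file (called demo.py here purely for illustration) and a hypothetical main() function; neither name comes from the tutorial notebook.

```python
def main() -> None:
    """Hypothetical entry point, used only for this illustration."""
    print("Running as a script")


# __name__ is "__main__" when the file is executed directly
# (e.g., python demo.py) and the module name (e.g., "demo") when imported.
if __name__ == "__main__":
    main()
```

Importing the file from another module defines main() without running it, which is why the guard is useful for code that should also be importable.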

Goal: orientation/read code, try to figure out things

Data items

Optional additional challenge: use the constants as keys (package development docs)
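
A rough sketch of the idea behind the optional challenge, under the assumption that the package exposes field names as constants. The Fields class below is a hypothetical stand-in defined locally; the actual names should be taken from the package development docs.

```python
class Fields:
    """Hypothetical stand-in for a constants class (not the package's own)."""

    ID = "ID"
    JOURNAL = "journal"
    YEAR = "year"


# Using constants instead of string literals as dictionary keys.
record = {
    Fields.ID: "Smith1990",
    Fields.JOURNAL: "MIS Quarterly",
    Fields.YEAR: "1990",
}

# A misspelled constant (e.g., Fields.JORNAL) raises AttributeError at once,
# whereas a misspelled string key would silently add or miss an entry.
print(record[Fields.JOURNAL])
```

The benefit for students is that editors can autocomplete the constants and linters can flag misspellings.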

Solution

External libraries

After 2-3 minutes: write BibtexParser on the blackboard

Important: bibtexparser version has changed

Students need to use the old entrypoint (available in the docs menu “Migrating: v1 -> v2”).

# v1
import bibtexparser
with open('bibtex.bib') as bibtex_file:
    bib_database = bibtexparser.load(bibtex_file)

Solution

Solution

Functions

  • Students can check whether the key (journal) is in the record and whether it matches any of the entries with a journal impact factor.
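
One possible shape of such a check is sketched below. The lookup table, its placeholder values, the second record, and the function name get_impact_factor are all made up for illustration and are not taken from the tutorial.

```python
from typing import Optional

# Hypothetical lookup table: journal name -> impact factor.
# The numbers are placeholders, not real impact factors.
JOURNAL_IMPACT_FACTORS = {"MIS Quarterly": 1.0, "Nature": 2.0}


def get_impact_factor(record: dict) -> Optional[float]:
    """Return the impact factor of the record's journal, or None if unknown."""
    if "journal" not in record:
        # e.g., conference papers have no journal field
        return None
    return JOURNAL_IMPACT_FACTORS.get(record["journal"])


records = [
    {"ID": "Smith1990", "journal": "MIS Quarterly"},
    {"ID": "Doe2001", "booktitle": "Some Proceedings"},  # made-up record
]
for record in records:
    print(record["ID"], get_impact_factor(record))
```

Checking for the key first avoids a KeyError on records without a journal, and dict.get returns None for journals missing from the table.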

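Explain the difference between call-by-value and call-by-reference

  • Python always passes object references; what differs is whether the object can be changed in place
  • Behaves like call-by-value: immutable “simple data types” (str, int, float); assigning to the parameter only rebinds the local name
  • Behaves like call-by-reference: mutable objects (list, dict, most class instances); in-place changes are visible to the caller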
# call-by-value

def change_journal(journal: str) -> None:
    journal = "Nature"  # rebinds the local name only

journal = "MIS Quarterly"
change_journal(journal)
print(journal)
# prints MIS Quarterly (not Nature)
# behaves like call-by-value for simple/immutable data types in Python

# call-by-reference

def change_journal(record: dict) -> None:
    record['journal'] = "Nature"

record = {
    "ID": "Smith1990",
    "journal": "MIS Quarterly"
}
change_journal(record)
print(record)
# prints {'ID': 'Smith1990', 'journal': 'Nature'} because record is passed
# as an object reference (mutable type in Python) and modified in the function
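
A possible follow-up for students who ask why this is not "true" call-by-reference: reassigning a mutable parameter inside a function does not affect the caller; only an in-place modification does. The function names replace_record and update_record are hypothetical and exist only for this sketch.

```python
def replace_record(record: dict) -> None:
    # Assignment rebinds the local name to a new dict;
    # the caller's dict is left unchanged.
    record = {"ID": "Other", "journal": "Nature"}


def update_record(record: dict) -> None:
    # Item assignment modifies the dict object the caller also refers to.
    record["journal"] = "Nature"


record = {"ID": "Smith1990", "journal": "MIS Quarterly"}

replace_record(record)
print(record["journal"])  # MIS Quarterly

update_record(record)
print(record["journal"])  # Nature
```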

Show input() when iterating over the results

Explain difference between positional and keyword arguments
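
A small sketch that could be used for this explanation. The function format_record and its parameters are hypothetical, chosen only to match the record examples in these notes.

```python
def format_record(record_id: str, journal: str, *, year: str = "n.d.") -> str:
    """Build a short label; year is keyword-only because of the bare *."""
    return f"{record_id}: {journal} ({year})"


# Positional arguments: values are matched to parameters by their order.
print(format_record("Smith1990", "MIS Quarterly"))

# Keyword arguments: values are matched by name, so the order does not matter.
print(format_record(journal="MIS Quarterly", record_id="Smith1990", year="1990"))
```

Passing year positionally, as in format_record("Smith1990", "MIS Quarterly", "1990"), raises a TypeError, which is another error message worth reading together with students.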

Google

  • “mypy no-untyped-def”
  • “pylint missing-function-docstring”

Pylint example:
************* Module colrev.ops.built_in.search_sources.aisel
colrev/ops/built_in/search_sources/aisel.py:225:19: W3101: Missing timeout argument ...

Solution

Wrap-up

  • Small examples: clarify the “big goal” and start with small steps
  • Linters: already installed
  • Code highlighting (Visual Studio Code): functions yellow, variables light blue, instances blue, classes green, strings orange
  • Python debuggers/plugins (for regular Python programming / special cases like memory usage or distributed systems)

Resources