# dataclasses

**Creating Custom Data Types Using `dataclasses`**

**Introduction:**

Imagine you have a grocery list with items like milk, eggs, and bread. You want to create a custom data type to store these items. Instead of using a dictionary, you can use `dataclasses` to automatically generate special methods (like `__init__` and `__repr__`) for your custom data type.

**Using `dataclass` Decorator:**

The `dataclass` decorator is used to define a custom data type. For example, let's define a `GroceryItem` data type:

```python
from dataclasses import dataclass

@dataclass
class GroceryItem:
  name: str  # Name of the item
  quantity: int  # Quantity of the item
```

This decorator automatically generates methods like `__init__` and `__repr__` for `GroceryItem`.

**`__init__` Method:**

The `__init__` method initializes the data type with the given attributes. For example:

```python
item1 = GroceryItem("Milk", 2)
```

This creates an instance of `GroceryItem` with name "Milk" and quantity 2.

**`__repr__` Method:**

The `__repr__` method provides a readable string representation of the instance; `print()` uses it here because the class defines no `__str__`. For example, for `item1`:

```python
print(item1)  # Output: GroceryItem(name='Milk', quantity=2)
```

**Real-World Example:**

Suppose you have a list of students with their names and grades. Instead of using a list of dictionaries, you can create a custom data type `Student` using `dataclasses`:

```python
from dataclasses import dataclass

@dataclass
class Student:
  name: str
  grade: float

students = [
  Student("Alice", 95),
  Student("Bob", 87),
]
```

You can now access and manipulate student data easily:

```python
for student in students:
  print(f"{student.name}: {student.grade}")
```

**Potential Applications:**

`dataclasses` are useful in various scenarios:

* **Data Structures:** Create custom data types for storing complex data in a structured manner.
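* **Data Validation:** Type annotations document the expected type of each field, but they are not enforced at runtime; add checks in `__post_init__` or use a static type checker if you need validation.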
* **Code Generation:** Automatically generate boilerplate code for data manipulation, reducing the need for manual coding.

***

**Decorators:**

Decorators are a way to add extra functionality to a class without rewriting the class body by hand. The `@dataclass` decorator inspects the annotated fields and adds generated special methods to the class, making it easier to work with data.

**Fields:**

In a dataclass, fields are the variables that store the data. Fields are defined with their type, like `name: str`. The decorator will add methods to the class that use these fields.

**Methods Added by the Decorator:**

The decorator can add several methods to the class, depending on the parameters used:

* `__init__`: Initializes the class with the provided fields.
* `__repr__`: Returns a string representation of the class, showing the field names and values.
* `__eq__`: Compares two instances of the class for equality.
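* `__hash__`: Generates a hash value for instances, used for set membership and dictionary keys. By default it is generated only when both `eq` and `frozen` are true; with `eq=True` alone, instances are unhashable.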
* `__match_args__`: A tuple of field names, used for pattern matching.

**Parameters:**

* `init`: Controls whether an `__init__` method is added.
* `repr`: Controls whether a `__repr__` method is added.
* `eq`: Controls whether an `__eq__` method is added.
* `order`: Controls whether comparison methods (`__lt__`, `__le__`, `__gt__`, `__ge__`) are added.
* `unsafe_hash`: Controls whether a `__hash__` method is added, even if the class is mutable.
* `frozen`: Makes the class immutable, preventing fields from being changed.
* `match_args`: Controls whether a `__match_args__` tuple is added.
* `kw_only`: Marks all fields as keyword-only arguments in the `__init__` method.
* `slots`: Uses `__slots__` to optimize memory usage.
* `weakref_slot`: Adds a special slot to support weak references.

**Real-World Examples:**

Consider a `Customer` class with fields `name`, `age`, and `email`:

```python
from dataclasses import dataclass

@dataclass
class Customer:
    name: str
    age: int
    email: str
```

* `__init__`: Initializes the `Customer` with the fields provided during creation.
* `__repr__`: Returns a string like `"Customer(name='John', age=30, email='john@example.com')"`.
* `__eq__`: Compares two `Customer` instances based on the fields.

**Applications:**

Dataclasses are useful for:

* Storing data in a structured way.
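* Serializing data easily (for example via `asdict()`); deserializing usually means passing the parsed values back to the constructor.
* Creating immutable objects (with `frozen=True`), which are safer to share between threads.
* Simplifying comparison; validation is not automatic and has to be added yourself, for example in `__post_init__`.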
* Reducing boilerplate code when working with data.

***

### Understanding Field Function

Imagine your data as a bunch of LEGO bricks. Each brick represents a field, holding a specific value like a name or age. To create a custom LEGO structure (dataclass), you can use the `field` function to modify how these bricks behave.

#### Customization Options

With `field`, you have several customization options:

* **Default Value:** Set a default value for each brick. For example, if you have a `name` field, you can set its default to "John". Without a default, the value must be supplied when the object is created.
* **Default Factory:** Sometimes, you want bricks to act as LEGO factories. This allows you to create new bricks on the fly. For instance, a `friends` field could automatically create an empty list where you can add friends.
* **Initialization:** Choose if the brick should be included in the initial construction of your LEGO structure (when you first create the dataclass object).
* **Representation:** Decide if the brick should be included when you describe your structure (like printing its details).
* **Hashing:** Specify if the brick should be considered when calculating a unique identifier for your structure.
* **Comparison:** Determine if the brick should be included when comparing two structures.
* **Metadata:** Add extra information to the brick, like notes or custom properties.
* **Keyword-Only:** Mark the brick so that its value must be passed by keyword (for example `name="John"`) rather than by position when creating the structure.

#### Code Snippet for Customization

```python
from dataclasses import dataclass, field

@dataclass
class Person:
    name: str = "John"                                   # Default value
    age: int = field(default_factory=int)                # Default factory (int() returns 0)
    email: str = field(init=False, default="")           # Excluded from __init__, so it needs a default
    friends: list[str] = field(default_factory=list, repr=False)  # Excluded from the repr
```

#### Real-World Application

* **Default Value:** Set default values for user registration forms to save time and effort.
* **Default Factory:** Create dynamic lists or sets that start empty but can be populated later, like in a shopping cart.
* **Initialization:** Hide certain fields from the initialization process, allowing them to be set separately.
* **Representation:** Control what information is displayed when printing or inspecting objects, protecting privacy or simplifying debugging.
* **Hashing:** Ensure that unique identifiers are calculated consistently across objects.
* **Comparison:** Exclude fields such as caches or timestamps from equality and ordering checks, which matters for sorting and finding duplicates.
* **Metadata:** Attach custom attributes to fields, like units of measurement or data validation rules.
* **Keyword-Only:** Require certain fields to be passed by keyword, which keeps constructor calls with many arguments readable.

***

**Field Objects**

*Imagine a field object as a blueprint for a specific piece of data that you want to store in your custom class.*

**Purpose of Field Objects:**

* They define the properties and characteristics of each data item in your class.

**Attributes of Field Objects:**

* **name:** The name of the data item (e.g., "age", "city").
* **type:** The data type of the item (e.g., int, str).
* **default (optional):** The default value for the item if none is provided when creating an instance of the class.
* **Other attributes (optional):** These can be used to customize the behavior of the field object, but are not commonly used.

**Creating Field Objects:**

* You don't create them directly.
* They are created automatically when you use the `@dataclass` decorator on your class.
* You can access them using the `fields()` function after defining your class.

**Real-World Example:**

Imagine you have a `Person` class that stores someone's age and city:

```python
from dataclasses import dataclass

@dataclass
class Person:
    age: int
    city: str
```

* The `age` field would have a `name` of "age" and a `type` of `int`.
* The `city` field would have a `name` of "city" and a `type` of `str`.
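
A short sketch of inspecting these attributes at runtime. It reuses the `Person` class above, with a default added to `city` purely for illustration:

```python
from dataclasses import dataclass, fields, MISSING

@dataclass
class Person:
    age: int
    city: str = "Unknown"  # default added here for illustration

for f in fields(Person):
    has_default = f.default is not MISSING
    print(f.name, f.type.__name__, has_default)
```

A field without a default stores the `MISSING` sentinel in its `default` attribute, which is how the loop tells the two cases apart.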

**Potential Applications:**

* **Data Validation:** You can iterate over field objects (and any `metadata` attached to them) to write your own validation of instance data; dataclasses do not validate anything automatically.
* **Automatic Documentation:** Field objects can be used to automatically generate documentation for your classes.
* **Generic Tooling:** Helpers such as serializers or form builders can loop over field objects instead of hard-coding attribute names, which reduces duplication across classes.

***

**Fields Function**

The `fields()` function in the `dataclasses` module is used to get all the fields that define a dataclass.

**How to use it:**

You can call the `fields()` function with either a dataclass or an instance of a dataclass as the argument.

For example:

```python
from dataclasses import dataclass, fields

@dataclass
class Person:
    name: str
    age: int

print(fields(Person))  # prints a tuple of Field objects for each field in Person
```

**What it returns:**

The `fields()` function returns a tuple of `Field` objects. Each `Field` object contains information about a field, including its name, type, and other metadata.

**Example:**

The following example shows how to use the `fields()` function to get the fields of a dataclass and print their names and types:

```python
from dataclasses import dataclass, fields

@dataclass
class Person:
    name: str
    age: int

for field in fields(Person):
    print(f"{field.name}: {field.type}")
```

**Output:**

```
name: <class 'str'>
age: <class 'int'>
```

**Applications:**

The `fields()` function can be useful for various purposes, such as:

* Getting information about the fields of a dataclass
* Validating the values of a dataclass instance
* Generating documentation for a dataclass
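
As one possible take on the validation use case, the sketch below checks each value against its annotation. The `check_types` helper is hypothetical and deliberately simple: it only handles annotations that are plain classes such as `str` or `int`, and skips anything else.

```python
from dataclasses import dataclass, fields

@dataclass
class Person:
    name: str
    age: int

def check_types(instance):
    """Return (field name, expected type) pairs whose values have the wrong type."""
    return [
        (f.name, f.type)
        for f in fields(instance)
        if isinstance(f.type, type) and not isinstance(getattr(instance, f.name), f.type)
    ]

print(check_types(Person("Alice", 30)))      # []
print(check_types(Person("Bob", "thirty")))  # [('age', <class 'int'>)]
```

Generic annotations such as `list[str]`, or annotations stored as strings, would need more elaborate handling than this sketch provides.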

***

**Simplified Explanation of `asdict()` function in Python's `dataclasses` module:**

**What is `asdict()`?**

`asdict()` is a function that converts a `dataclass` object into a dictionary.

**What is a `dataclass`?**

A `dataclass` is a class that makes it easy to define Python objects with specific data fields.

**How does `asdict()` work?**

`asdict()` takes a `dataclass` object and creates a dictionary with the object's field names as keys and the corresponding field values as values.

**Why use `asdict()`?**

You might use `asdict()` to convert a `dataclass` into a dictionary for any of the following reasons:

* To send the data in JSON format (`json.dumps()` can serialize a dictionary, but not a dataclass instance)
* To store the data in a database (databases usually store data in tables with rows and columns, which is similar to a dictionary)
* To pass the data to another function that expects a dictionary as input

**Example:**

Let's say you have a `dataclass` called `Person` with two fields: `name` and `age`. You can create a dictionary from a `Person` object like this:

```python
from dataclasses import dataclass, asdict

@dataclass
class Person:
    name: str
    age: int

person = Person("John Doe", 30)
person_dict = asdict(person)

print(person_dict)  # Output: {'name': 'John Doe', 'age': 30}
```
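
To illustrate the JSON use case mentioned earlier, here is a minimal round-trip sketch. Rebuilding the object with `Person(**...)` is one simple approach that works for flat dataclasses; it is not a built-in deserialization feature.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class Person:
    name: str
    age: int

person = Person("John Doe", 30)

payload = json.dumps(asdict(person))  # dataclass -> dict -> JSON text
print(payload)  # {"name": "John Doe", "age": 30}

restored = Person(**json.loads(payload))  # JSON text -> dict -> dataclass
print(restored == person)  # True
```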

**Additional Features:**

* **Custom Dictionary Factory:** You can provide a custom dictionary factory function to `asdict()` to control the type of dictionary that is created. By default, a regular `dict` is used.
* **Shallow Copy:** If you want a shallow copy of the dictionary, you can use a comprehension like this:

```python
from dataclasses import fields
person_dict = {field.name: getattr(person, field.name) for field in fields(person)}
```
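
The difference between the two approaches shows up with nested data. In this sketch the `Address` class and the `tags` field are hypothetical additions; `asdict()` recurses into nested dataclasses and copies containers, while the comprehension keeps references to the original objects:

```python
from collections import OrderedDict
from dataclasses import dataclass, field, fields, asdict

@dataclass
class Address:  # hypothetical nested dataclass
    city: str

@dataclass
class Person:
    name: str
    address: Address
    tags: list = field(default_factory=list)

person = Person("John Doe", Address("Paris"), ["a"])

deep = asdict(person)
shallow = {f.name: getattr(person, f.name) for f in fields(person)}

print(deep["address"])                       # {'city': 'Paris'}: nested dataclass converted too
print(shallow["address"] is person.address)  # True: same object, not converted
print(deep["tags"] is person.tags)           # False: the list was copied
print(type(asdict(person, dict_factory=OrderedDict)).__name__)  # OrderedDict
```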

**Real-World Applications:**

* **Web Development:** Convert `dataclasses` to dictionaries for use in JSON responses.
* **Database Storage:** Convert `dataclasses` to dictionaries for easy storage in databases.
* **Configuration Management:** Define configuration options as `dataclasses` and convert them to dictionaries for easy access in code.

***

### Dataclasses

**What are dataclasses?**

Dataclasses are a way to create classes in Python that are specifically designed to hold data. They are similar to regular classes, but they come with some built-in functionality that makes them easier to use for data storage and manipulation.

**Advantages of dataclasses:**

* They are easy to create and use.
* They are mutable by default; pass `frozen=True` to the decorator if instances must not change after creation.
* They can be converted to plain dictionaries and tuples with `asdict()` and `astuple()`, which makes them easy to serialize; there is no built-in deserialization.

**How to create a dataclass:**

To create a dataclass, you use the `@dataclass` decorator. The decorator takes a class as its argument, and it adds the necessary functionality to the class to make it a dataclass.

For example, the following code creates a simple dataclass called `Person`:

```python
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int
```

**Using dataclasses:**

Once you have created a dataclass, you can use it just like any other class. You can create instances of the class, and you can access the data in the instances using the dot operator.

For example, the following code creates an instance of the `Person` class and accesses the data in the instance:

```python
person = Person("John Doe", 30)

print(person.name)  # Output: John Doe
print(person.age)  # Output: 30
```

**Customizing dataclasses:**

You can customize the behavior of dataclasses by specifying additional arguments to the `@dataclass` decorator. These arguments include:

* `init`: If true (the default), an `__init__` method is generated.
* `repr`: If true (the default), a `__repr__` method is generated.
* `eq`: If true (the default), an `__eq__` method is generated that compares instances field by field.
* `order`: If true (the default is `False`), the ordering methods `__lt__`, `__le__`, `__gt__`, and `__ge__` are generated.

You can also write a method yourself. If the class already defines `__init__`, the decorator leaves it alone instead of generating one, as in the following example with a custom constructor:

```python
@dataclass
class Person:
    name: str
    age: int

    def __init__(self, name, age):
        self.name = name.upper()
        self.age = age
```
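
An alternative that keeps the generated constructor is `__post_init__`, which the generated `__init__` calls as its last step. A minimal sketch of the same upper-casing behavior:

```python
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int

    def __post_init__(self):
        # Runs at the end of the generated __init__
        self.name = self.name.upper()

person = Person("John Doe", age=30)
print(person)  # Person(name='JOHN DOE', age=30)
```

This way the constructor signature stays in sync with the declared fields when fields are added or removed.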

**Real-world applications of dataclasses:**

Dataclasses can be used in a variety of real-world applications, including:

* Data storage and retrieval
* Data validation
* Data transformation
* Data serialization and deserialization

### Astuple

**What is astuple?**

`astuple` is a function that converts a dataclass instance to a tuple. The function takes the dataclass instance as its first argument, and it takes an optional `tuple_factory` argument that specifies the factory function to use for creating the tuple.

**How to use astuple:**

To use `astuple`, you simply call the function with the dataclass instance as its argument. The function will return a tuple that contains the data from the dataclass instance.

For example, the following code converts a `Person` instance to a tuple:

```python
from dataclasses import astuple

person = Person("John Doe", 30)

person_tuple = astuple(person)

print(person_tuple)  # Output: ('John Doe', 30)
```
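
The optional `tuple_factory` argument can be sketched with a named tuple. The `PersonRow` type is a hypothetical helper introduced only for this example:

```python
from collections import namedtuple
from dataclasses import dataclass, astuple

@dataclass
class Person:
    name: str
    age: int

PersonRow = namedtuple("PersonRow", ["name", "age"])  # hypothetical helper type

person = Person("John Doe", 30)

# tuple_factory receives the field values as one iterable, which _make accepts
row = astuple(person, tuple_factory=PersonRow._make)
print(row)  # PersonRow(name='John Doe', age=30)

name, age = astuple(person)  # plain tuples unpack as usual
```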

**Real-world applications of astuple:**

`astuple` can be used in a variety of real-world applications, including:

* Converting dataclass instances to tuples for storage in a database
* Converting dataclass instances to tuples for transmission over a network
* Converting dataclass instances to tuples for use in a template engine

***

**make\_dataclass() Function**

**Purpose:** Creates a new dataclass with a custom name, fields, and other properties.

**Parameters:**

* **cls\_name (str):** The name of the new dataclass.
* **fields (iterable):** A list of fields to be included in the dataclass. Each field can be specified as a name (str), a tuple of (name, type), or a tuple of (name, type, Field) where Field is a dataclass field descriptor.
* **bases (tuple, optional):** A tuple of base classes for the new dataclass.
* **namespace (dict, optional):** A dictionary of additional attributes to add to the dataclass.
* **init (bool, optional):** True if the dataclass should have an automatically generated `__init__()` method.
* **repr (bool, optional):** True if the dataclass should have an automatically generated `__repr__()` method.
* **eq (bool, optional):** True if the dataclass should have an automatically generated `__eq__()` method.
* **order (bool, optional):** True if the dataclass should have automatically generated `__lt__()`, `__le__()`, `__gt__()`, and `__ge__()` methods for ordering.
* **unsafe\_hash (bool, optional):** True if the dataclass should have an automatically generated `__hash__()` method. Note that this can lead to unexpected behavior if the dataclass is mutable.
* **frozen (bool, optional):** True if the dataclass should be immutable.
* **match\_args (bool, optional):** True if the dataclass should have an automatically generated `__match_args__` tuple.
* **kw\_only (bool, optional):** True if the dataclass should have keyword-only arguments in its `__init__()` method.
* **slots (bool, optional):** True if the dataclass should use slots for its attribute storage.
* **weakref\_slot (bool, optional):** True if the dataclass should have a weak reference slot.
* **module (str, optional):** The module to which the dataclass should belong.

**Return Value:**

* A new dataclass with the specified properties.

**Real World Example:**

Suppose we want to create a dataclass called `Person` with two fields: `name` and `age`. We can use the `make_dataclass()` function as follows:

```python
from dataclasses import make_dataclass

Person = make_dataclass("Person", [("name", str), ("age", int)])
```

This will create a dataclass with the following properties:

* Name: Person
* Fields: name (str), age (int)
* Automatically generated `__init__()`, `__repr__()`, and `__eq__()` methods
* No base classes
* No additional attributes in the namespace
* No custom behavior for any of the optional parameters
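
The three-element field form and the `namespace` parameter can be sketched like this. The default value and the `greet` method are hypothetical additions for illustration:

```python
from dataclasses import make_dataclass, field, is_dataclass

Person = make_dataclass(
    "Person",
    [
        ("name", str),
        ("age", int, field(default=0)),  # (name, type, Field) form
    ],
    namespace={"greet": lambda self: f"Hello, {self.name}"},  # hypothetical extra method
)

p = Person("Alice")
print(p)                     # Person(name='Alice', age=0)
print(p.greet())             # Hello, Alice
print(is_dataclass(Person))  # True
```

The result behaves like a class written with the `@dataclass` decorator, which makes `make_dataclass()` handy when the field list is only known at runtime.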

**Potential Applications:**

Dataclasses are useful for creating simple data structures with well-defined fields and behaviors. They are particularly useful in cases where the data structure is immutable and does not require complex operations or logic. Some potential applications include:

* Representing configuration data within an application.
* Defining data structures for data exchange between different modules or components.
* Creating simple data models for use in web applications or APIs.

***

**replace() Function**

**Purpose:**

Data classes (available since Python 3.7) make it easy to define classes that hold data, such as the fields of a database record. The `replace()` function is a convenient way to create a new data class instance with the same type as an existing one, but with some of the fields replaced with new values.

**How to use `replace()`:**

To use the `replace()` function, you pass it an existing data class instance as the first argument, and then specify the fields you want to replace as keyword arguments. For example:

```python
from dataclasses import dataclass, replace

@dataclass
class Person:
    name: str
    age: int

person = Person("Alice", 25)
new_person = replace(person, name="Bob")
```

In this example, the `replace()` function creates a new `Person` instance with the same age as the original `person`, but with the name "Bob".

**What `replace()` does:**

The `replace()` function works by creating a new instance of the same type as the original object, and then setting the specified fields to the new values. It does this by calling the `__init__()` method of the data class, which ensures that any `__post_init__()` method is also called.
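
A small sketch of this behavior, using a hypothetical `Rectangle` class whose `area` field is computed in `__post_init__()`:

```python
from dataclasses import dataclass, field, replace

@dataclass
class Rectangle:  # hypothetical example class
    width: float
    height: float
    area: float = field(init=False)  # derived, not accepted by __init__

    def __post_init__(self):
        self.area = self.width * self.height

r1 = Rectangle(2.0, 3.0)
r2 = replace(r1, width=4.0)

print(r1)  # Rectangle(width=2.0, height=3.0, area=6.0)
print(r2)  # Rectangle(width=4.0, height=3.0, area=12.0)
```

The original instance is untouched, and the derived `area` is recomputed for the copy because `__post_init__()` runs again.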

**Things to watch out for:**

There are a few things to watch out for when using the `replace()` function:

* You can only replace fields that have been defined as fields in the data class. If you try to replace a field that has not been defined, you will get a `TypeError`.
* You cannot replace fields that have been defined with `init=False`, because `replace()` passes the new values through `__init__()`, which does not accept those fields. Attempting it raises an error (documented as `ValueError`; recent Python versions may raise `TypeError` instead).
* Fields with `init=False` are normally set in `__post_init__()`, so they are recomputed for the copy rather than carried over. If you need to carry such a value over, write your own copying helper or alternative constructor.
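
Both failure modes can be reproduced with a short sketch. The `Account` class is hypothetical, and both exception types are caught because the one raised for `init=False` fields may depend on the Python version:

```python
from dataclasses import dataclass, field, replace

@dataclass
class Account:  # hypothetical example class
    owner: str
    token: str = field(init=False, default="unset")

acct = Account("Alice")

for changes in ({"nickname": "Al"}, {"token": "abc"}):
    try:
        replace(acct, **changes)
    except (TypeError, ValueError) as exc:
        # Unknown field, then init=False field: both are rejected
        print(f"{changes} -> {type(exc).__name__}")
```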

**Applications of `replace()`:**

The `replace()` function can be used in a variety of situations, such as:

* Producing an updated copy of an existing data class instance (the original is left unchanged)
* Creating new data class instances with different values for some fields
* Deriving variants of a frozen instance, whose fields cannot be assigned directly

***

**is\_dataclass**

* Checks if a given object is a dataclass or an instance of one.

**MISSING**

* A special value representing a missing default or default factory.

**KW\_ONLY**

* A sentinel used as the type annotation of a pseudo-field (conventionally named `_`); every field declared after it is keyword-only.
* Keyword-only fields must be specified as keywords when creating an instance of the dataclass.

**FrozenInstanceError**

* An error raised when trying to modify attributes of an immutable dataclass (i.e., one with `frozen=True`).

**Post-Initialization Processing**

* Dataclasses support post-initialization processing through the `__post_init__` method.
* `__post_init__` allows you to perform additional actions after the dataclass is created.

**Real-World Applications**

* **is\_dataclass:** Useful for checking if an object is a dataclass, e.g., for type checking or introspection.
* **MISSING:** Lets introspection code distinguish a field with no default from a field whose default is `None`.
* **KW\_ONLY:** Enforces keyword-only arguments for certain fields, promoting code clarity and consistency.
* **FrozenInstanceError:** Prevents accidental modifications to immutable dataclasses, ensuring the integrity of your data.
* **Post-Initialization Processing:** Allows for additional processing or setup after dataclass creation, offering flexibility and extensibility.

**Simplified Code Example**

```python
from dataclasses import (
    dataclass, fields, is_dataclass, FrozenInstanceError, KW_ONLY, MISSING,
)

# A frozen dataclass with one keyword-only field (KW_ONLY needs Python 3.10+)
@dataclass(frozen=True)
class Point:
    x: float
    y: float = 0.0  # Ordinary default value
    _: KW_ONLY  # Fields after this sentinel are keyword-only
    z: float | None = None  # Optional keyword-only field

# 'z' must be passed by keyword
point = Point(1.0, 2.0, z=3.0)

# is_dataclass() accepts both dataclass types and instances
print(is_dataclass(point))  # True
print(is_dataclass(Point))  # True

# 'x' has no default, so its Field.default is the MISSING sentinel
x_field = fields(Point)[0]
print(x_field.default is MISSING)  # True

# Assigning to any field of a frozen instance raises FrozenInstanceError
try:
    point.x = 4.0
except FrozenInstanceError as e:
    print(e)  # cannot assign to field 'x'
```

***

**Python Data Classes**

**Overview**

Data classes in Python are a simple way to create classes that store data, like a row in a database table.

**Creating a Data Class**

You create a data class using the `@dataclass` decorator, which reads the annotated fields of the class and generates methods such as `__init__` and `__repr__` from them. For example:

```python
from dataclasses import dataclass

@dataclass
class Student:
    name: str
    age: int
    gpa: float
```

This creates a class called `Student` that has three fields: `name`, `age`, and `gpa`.

**Initialising Data Class**

We can create a new student object by passing values to the fields when we create the object:

```python
student1 = Student("John Smith", 20, 3.5)
```

**Accessing Data Class Fields**

We can access the fields of a data class object using dot notation:

```python
print(student1.name)  # John Smith
print(student1.age)  # 20
print(student1.gpa)  # 3.5
```

**Default Values**

Fields can have default values set when creating the data class:

```python
@dataclass
class Person:
    name: str = "John Doe"  # Default name is "John Doe"
    age: int = 20  # Default age is 20
```
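
Defaults like the ones above work for immutable values. For lists, dictionaries, and sets, dataclasses reject a plain default and expect `field(default_factory=...)` instead, as this sketch with hypothetical `BadCart` and `Cart` classes shows:

```python
from dataclasses import dataclass, field

try:
    @dataclass
    class BadCart:  # hypothetical example class
        items: list = []  # mutable default: rejected by dataclasses
except ValueError as exc:
    print(f"rejected: {exc}")

@dataclass
class Cart:  # hypothetical example class
    items: list = field(default_factory=list)  # a fresh list per instance

a, b = Cart(), Cart()
a.items.append("milk")
print(a.items, b.items)  # ['milk'] []
```

The factory is called once per instance, so the two carts never share the same list.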

**Post-Initialisation Method**

The `__post_init__` method runs after the object is created. This method can be used to perform additional calculations or logic based on the values of the fields.

```python
@dataclass
class Book:
    title: str
    author: str
    pages: int

    def __post_init__(self):
        self.page_count = self.pages  # Calculate the page count
```

**Frozen Data Classes**

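Frozen data classes cannot be modified after they are created: assigning to a field raises `FrozenInstanceError`. This is useful for preventing accidental mutation and, combined with the default `eq=True`, makes instances hashable.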

```python
@dataclass(frozen=True)
class Immutable:
    value: str
```
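
A brief sketch of both effects, reusing the `Immutable` class from above:

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Immutable:
    value: str

obj = Immutable("a")
try:
    obj.value = "b"
except FrozenInstanceError:
    print("assignment blocked")

# frozen=True together with the default eq=True makes instances hashable
lookup = {obj: "usable as a dict key"}
print(lookup[Immutable("a")])  # equal instances hash alike
```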

**Inheritance in Data Classes**

Data classes can inherit from other data classes:

```python
@dataclass
class Teacher(Student):
    subject: str
```
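
In a subclass, the fields of the base class come first, followed by the new ones. A self-contained sketch with the `Student` and `Teacher` classes from this section (the argument values are made up):

```python
from dataclasses import dataclass, fields

@dataclass
class Student:
    name: str
    age: int
    gpa: float

@dataclass
class Teacher(Student):
    subject: str

print([f.name for f in fields(Teacher)])  # ['name', 'age', 'gpa', 'subject']

t = Teacher("Ms. Lee", 41, 3.9, "Physics")
print(isinstance(t, Student))  # True
```

Because of this ordering, a subclass cannot add a field without a default after base-class fields that have defaults, unless the new field is keyword-only.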

**Applications of Data Classes**

Data classes can be used in a variety of applications, such as:

* Representing data from a database
* Creating configuration objects
* Storing user input
* Creating immutable objects
* Implementing object-oriented programming principles
