54 CHEN

Quick Backend Handbook for Frontend Developers

Frontend engineers benefit from a working knowledge of the backend. As web applications evolve, it helps to understand database management, security, web frameworks, the AOP programming paradigm, and the features of commonly used languages and frameworks. This handbook gives a brief overview of these topics to help you grasp the key points of backend development.

Databases

1. Database Selection

Choose the type of database that suits the project’s requirements, including relational databases (such as MySQL, PostgreSQL) and NoSQL databases (such as MongoDB, Redis). Understand their characteristics, advantages, and disadvantages to ensure the selection of a database that aligns with the project’s needs.

| Name | Features | Advantages | Disadvantages | Best suited for |
| --- | --- | --- | --- | --- |
| MySQL | Relational database; supports the SQL query language; ACID transaction support | Mature open-source database with a large community; good transaction support; optimisations for complex queries are available | Difficult to scale for large-scale, high-concurrency scenarios; not suitable for certain non-relational data scenarios; lower read and write throughput than in-memory stores such as Redis | Traditional relational database applications, such as enterprise-level applications |
| PostgreSQL | Relational database; supports the SQL query language; ACID transaction support | High standard compliance; supports complex queries and data types; strong scalability | Requires more hardware resources; configuration and optimisation are complex; relatively lower performance in heavy write and update scenarios | Applications with complex queries that require high standard compliance |
| MongoDB | NoSQL database, document storage approach | High flexibility and scalability; supports complex data structures and nested documents; suitable for read-heavy workloads | Multi-document transactions require version 4.0 or later (4.2 for sharded clusters); relatively poor write performance; larger space occupation | Large-scale document-oriented data storage, such as logs and user information |
| Redis | NoSQL database, key-value storage approach | Extremely high read and write performance; supports various data structures, such as strings, lists, and sets; in-memory storage for fast read and write operations | Relatively poor data durability, sensitive to power outages; does not support complex queries; high memory usage | High-speed read and write operations, such as caching and session storage |

2. Database Design

Learn to conduct proper database design, including table normalisation, index optimisation, and relationship modeling. Good database design is crucial for application performance and scalability.

| Normal Form | Definition | Recommendations |
| --- | --- | --- |
| First Normal Form (1NF) | Each field contains an atomic value, and there are no repeating columns. | Ensure that each column contains a single atomic value. Avoid repeating columns; move repeated data into a related table. |
| Second Normal Form (2NF) | Satisfies 1NF, and non-prime attributes are fully functionally dependent on the primary key. | Ensure that non-prime attributes depend on the entire primary key, not on part of a composite key. Use appropriate composite keys. |
| Third Normal Form (3NF) | Satisfies 2NF, and there are no transitive dependencies between non-prime attributes. | Associate non-prime attributes directly with the primary key, not through other non-prime attributes. |
| Boyce-Codd Normal Form (BCNF) | Satisfies 3NF, and for every non-trivial functional dependency X → Y, X is a superkey. | Split tables so that every determinant (the left-hand side of a dependency) is a candidate key. |
| Fourth Normal Form (4NF) | Satisfies BCNF, and there are no non-trivial multi-valued dependencies other than those on a superkey. | Store independent multi-valued facts about an entity in separate tables. |
| Fifth Normal Form (5NF) | Satisfies 4NF, and every non-trivial join dependency is implied by the candidate keys. | Decompose a table only when the parts can be rejoined without losing rows or creating spurious ones. |

In most cases, achieving the third normal form is a good baseline. This helps ensure clarity in data structure and minimizes redundancy. If the project demands high read performance, moderate redundancy (denormalisation) may be considered to reduce join operations. Where data integrity matters more than query speed, reaching Boyce-Codd Normal Form (BCNF) or higher is advisable.
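To make the third normal form concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customers and orders tables and their columns are hypothetical, chosen only for illustration. Customer details live in one table, so an order row never repeats the customer's city.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Customer facts are stored once; orders only reference them by key.
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        city TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total       REAL NOT NULL
    );
""")
conn.execute("INSERT INTO customers (id, name, city) VALUES (1, 'Ada', 'London')")
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(1, 20.0), (1, 35.5)],
)

# Changing the city touches exactly one row, however many orders exist.
conn.execute("UPDATE customers SET city = 'Paris' WHERE id = 1")

rows = conn.execute("""
    SELECT o.id, c.name, c.city, o.total
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    ORDER BY o.id
""").fetchall()
print(rows)
```

The cost of this design is the join at read time, which is the trade-off that denormalisation tries to avoid.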

3. Database Optimisation

In MongoDB, indexing plays a crucial role in performance optimisation, speeding up data queries and improving retrieval efficiency. Here are some key advantages of MongoDB indexes for performance optimisation:

| Index Advantage | Example | Disadvantages or Considerations |
| --- | --- | --- |
| Fast Data Retrieval | Creating a single-field index on a query field | Write operations may slow down; indexes consume storage space |
| Accelerating Sorting Operations | Creating a single-field index on a sorting field | A large number of indexes may increase memory consumption; balancing query and write performance needs careful consideration |
| Supporting Unique Constraints | Creating a unique index | Creating indexes consumes additional disk space; maintaining uniqueness may impact write performance |
| Improving Aggregation Performance | Creating a single-field index on an aggregation field | Excessive indexes may lead to performance degradation; careful selection of indexed fields is essential |
| Optimizing Range Queries | Creating a single-field index on a range query field | Choose index types based on the query pattern; not all queries require indexing |
| Speeding Up Data Updates and Deletions | Creating a single-field index on the field used to match the documents to update or delete | Excessive indexes may decrease write performance; not all fields are suitable for indexing |
| Supporting Covered Queries | When all queried fields are part of an index, the query can be answered from the index alone | Decide whether to create composite indexes based on query requirements |
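The effect of an index is easiest to see in a query plan. The sketch below uses SQLite as a stand-in because it ships with Python and needs no server; the users table and the idx_users_email index are hypothetical names. In MongoDB the analogous steps would be createIndex on the collection and explain on the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

def query_plan(sql, params=()):
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

lookup = "SELECT email FROM users WHERE email = ?"
args = ("user500@example.com",)

plan_before = query_plan(lookup, args)  # no index yet: the table is scanned
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(lookup, args)   # the planner now uses the index

print("before:", plan_before)
print("after: ", plan_after)
```

Because the query only asks for the indexed column, the second plan can be answered from the index alone, which is the covered-query case from the table.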

4. Hardware

The read and write speeds of storage media are influenced by various factors, including hardware types, interface standards, cache sizes, and more. Here’s a comparison of the read and write speeds of common storage media:

| Media | Read Speed | Write Speed | Considerations |
| --- | --- | --- | --- |
| SSD (Solid State Drive) | Very fast, up to several GB/s | Very fast, up to several GB/s | Limited by write endurance; not suitable for prolonged heavy write operations |
| SATA Hard Disk (HDD) | Relatively slow, typically in the hundreds of MB/s | Relatively slow, typically in the hundreds of MB/s | Mechanical structure, constrained by mechanical motion; suitable for large-capacity storage |
| RAM (Memory) | Extremely fast, up to tens of GB/s | Extremely fast, up to tens of GB/s | Temporary storage, data is lost on power loss; suitable for high-performance temporary computing |

Understanding the order of magnitude of hardware speeds can help strike a balance between read and write performance and storage space when designing databases. For example, storing hot data on SSDs and cold data on SATA hard disks can achieve a balance between performance and cost.

Security

The backend serves as the last line of defense, ensuring data security. When designing backend applications, various aspects of security, including data encryption, user authentication, and access control, need to be considered. Understanding common security issues and solutions helps in designing secure backend applications.

1. Data Encryption

Common encryption algorithms play a crucial role in securing data during transmission and storage. Here are some commonly used encryption algorithms and their applications on the web:

| Algorithm or Protocol | Type | Application |
| --- | --- | --- |
| TLS/SSL | Protocol combining symmetric, asymmetric, and hash algorithms | Ensures the confidentiality and integrity of network communication. HTTPS uses TLS to encrypt data during transmission. |
| AES (Advanced Encryption Standard) | Symmetric | Used to encrypt sensitive data, such as personal data stored in databases or communication between clients and servers. Passwords should be hashed rather than encrypted. |
| RSA | Asymmetric | Primarily used for digital signatures and key exchange. In web applications, it is commonly used for secure key exchange, ensuring the security of symmetric encryption. |
| SHA-256 | Hash | Generates digests for data, commonly used for integrity checks and digital signatures. On its own it is too fast for password storage; use a dedicated password hash such as BCrypt instead. |
| BCrypt | Password hash | Specifically designed for password hashing, enhancing security through salting and multiple iterations. Commonly used for storing user passwords in web applications. |
| JWT (JSON Web Token) | Signed token format (for example with HMAC SHA-256) | Generates tokens with signatures for secure information transmission between clients and servers, such as user authentication information. Commonly used for authentication and information transfer in web development. |
| OAuth | Authorization framework (token-based) | Lets a user grant one application limited access to their data in another by means of tokens. Commonly used for third-party login and authorization in web development. |
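The salting and iteration idea behind password hashing can be sketched with the standard library alone. BCrypt needs a third-party package in Python, so this example swaps in PBKDF2 from hashlib, which follows the same principle. The function names and the iteration count are illustrative assumptions, not recommendations for a specific system.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 200_000  # illustrative work factor; tune it for your hardware

def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for storage; the plain password is never stored."""
    salt = os.urandom(16)  # a fresh random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)

salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

The salt makes identical passwords hash differently, and the iteration count makes each guess expensive for an attacker who obtains the stored digests.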

2. Common Attacks

When designing backend applications, it’s essential to consider common attack methods, including SQL injection, XSS attacks, CSRF attacks, etc. Here are some common attack methods and preventive measures:

| Attack Type | Prevention Measures |
| --- | --- |
| SQL Injection | Use parameterized queries or prepared statements to prevent malicious injection of SQL code into databases. Ensure user input data undergoes proper validation and escaping. |
| Cross-Site Scripting (XSS) | Escape user input appropriately to avoid direct insertion into web pages. Use Content Security Policy (CSP) to prevent the execution of unauthorized scripts. |
| Cross-Site Request Forgery (CSRF) | Use randomly generated CSRF tokens embedded in forms. Validate the token’s presence and correctness in requests, ensuring that only requests from legitimate sources are processed. |
| Command Injection | Do not trust user input; validate and filter it. Use secure APIs or libraries to execute system commands, avoiding direct concatenation of user input. |
| Unauthorized Access | Use strong passwords and regularly change them. Implement authentication and authorization mechanisms to restrict user access. Use HTTPS to encrypt the transmission of sensitive data. |
| Sensitive Data Leakage | Encrypt sensitive data stored in databases using proper key management. Regularly review and update access permissions. |
| File Upload Vulnerabilities | Strictly limit file uploads, allowing only specific types and sizes of files. Ensure thorough validation and processing of user-uploaded files before storage or delivery. |
| Denial of Service (DoS/DDoS) | Use firewalls, reverse proxies, and load balancers to distribute and filter traffic. Regularly conduct capacity planning and performance optimization. |
| Man-in-the-Middle Attack | Use HTTPS for data transmission, ensuring data encryption during transit. Implement appropriate authentication and authorization mechanisms. |
| Information Leakage and Error Handling | Avoid returning detailed error messages directly to clients. In production, log error details and provide user-friendly error messages. |
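Two of the measures above, parameterized queries and output escaping, can be shown with the standard library. This is a minimal sketch: the users table and the malicious input string are made up for the demonstration.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", "admin"), ("bob", "user")])

malicious = "nobody' OR '1'='1"

# Unsafe: the input is concatenated into the SQL text and changes its meaning.
unsafe_sql = "SELECT name FROM users WHERE name = '" + malicious + "'"
leaked = conn.execute(unsafe_sql).fetchall()

# Safe: the driver passes the input as a value, never as SQL.
safe = conn.execute("SELECT name FROM users WHERE name = ?", (malicious,)).fetchall()

print(leaked)  # every row matches because of the injected OR clause
print(safe)    # empty: no user has that literal name

# XSS: escape user input before placing it in HTML.
comment = '<script>alert("xss")</script>'
escaped = html.escape(comment)
print(escaped)
```

Most template engines and frontend frameworks escape output by default, so manual escaping mainly matters when HTML is assembled by hand.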

Web Frameworks

1. Choosing the Right Backend Framework

Understand popular backend frameworks such as Express (Node.js), Django (Python), and Spring Boot (Java), and select a framework that suits the project requirements.

| Name | Language | Features |
| --- | --- | --- |
| Express.js | JavaScript | A concise and flexible Node.js web application framework, suitable for building RESTful APIs and single-page applications. |
| Django | Python | A high-level Python web framework emphasizing rapid development and the DRY (Don’t Repeat Yourself) principle. |
| Flask | Python | A lightweight Python web framework that is simple and easy to use, suitable for small to medium-sized applications. |
| Spring Boot | Java | A framework for building production-grade Spring applications quickly. It integrates various Spring ecosystem components. |
| Ruby on Rails | Ruby | A full-stack web framework for Ruby, focusing on developer-friendliness and the “convention over configuration” philosophy for rapid development. |
| Laravel | PHP | A PHP web framework with elegant syntax, modularity, ORM, and other tools, suitable for building modern PHP applications. |
| ASP.NET Core | C# | A cross-platform .NET framework for building high-performance, modern, cloud-native applications, supporting web, cloud, mobile, and IoT. |
| Nest.js | TypeScript | A TypeScript-based progressive Node.js framework that combines elements of OOP, FP, and FRP for building scalable server-side applications. |
| FastAPI | Python | A modern, fast (thanks to Starlette and Pydantic) web framework for building high-performance APIs in Python. |

2. RESTful API

REST is a design style for building scalable web services. Here are some common RESTful API design principles:

| Principle | Example | Description |
| --- | --- | --- |
| Resource Identifier (URI) | /users/{id} | Each resource has a unique identifier and is accessed and operated on through a URI. |
| Uniform Interface | HTTP methods (GET, POST, PUT, DELETE) | Use a consistent interface, including URI, HTTP methods, and representation, to simplify system architecture and improve visibility. |
| Statelessness | N/A | Each request contains all the information needed to process it, and the server does not store client state between requests. |
| Resource Representation | JSON, XML | Resource representations should contain enough information for clients to understand how to handle the resource. |
| Hypermedia-Driven (HATEOAS) | Embedded links in resource representation | Resource representations should include links to relevant operations, guiding clients through further state transitions. |
| Cacheability | Cache-Control header | Mark responses as cacheable where possible to improve performance and reduce server load. |

A simple way to judge whether an API design is RESTful is to check for verbs in the URL. If a path contains one (for example /getUser instead of GET /users/{id}), the design may not be RESTful, because the HTTP method should carry the action.
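The principles above can be sketched without any framework. In this minimal example the handle function and the in-memory users store are hypothetical stand-ins for a router and a database. The paths contain only nouns, and the HTTP method decides the action.

```python
from typing import Dict, Optional, Tuple

# In-memory "users" resource; a stand-in for a real database.
users: Dict[int, dict] = {1: {"id": 1, "name": "Ada"}}

def handle(method: str, path: str, body: Optional[dict] = None) -> Tuple[int, Optional[dict]]:
    """Dispatch on HTTP method plus noun-only path; return (status, representation)."""
    body = body or {}
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] != "users" or len(parts) > 2:
        return 404, None

    if len(parts) == 1:  # collection: /users
        if method == "GET":
            return 200, {"users": list(users.values())}
        if method == "POST":
            new_id = max(users, default=0) + 1
            users[new_id] = {**body, "id": new_id}
            return 201, users[new_id]
        return 405, None

    if not parts[1].isdigit():
        return 404, None
    user_id = int(parts[1])  # item: /users/{id}
    if user_id not in users:
        return 404, None
    if method == "GET":
        return 200, users[user_id]
    if method == "PUT":
        users[user_id] = {**body, "id": user_id}
        return 200, users[user_id]
    if method == "DELETE":
        del users[user_id]
        return 204, None
    return 405, None

print(handle("GET", "/users/1"))
print(handle("POST", "/users", {"name": "Grace"}))
print(handle("DELETE", "/users/1"))
print(handle("GET", "/users/1"))
```

Each call carries everything needed to process it, which is the statelessness principle; a real framework such as Express or FastAPI replaces the manual dispatch with route declarations.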

3. ORM (Object Relational Mapping)

ORM is a programming technique that converts data between incompatible type systems, such as object-oriented programming languages and relational databases. Here are some common ORM frameworks:

| Name | Language | Features |
| --- | --- | --- |
| Django ORM | Python | 1. Integrated into the Django framework for database interaction. 2. Provides model classes for defining database table structures. 3. Supports multiple database backends. |
| Hibernate | Java | 1. An ORM framework for Java, widely used in Java EE projects. 2. Supports mapping to relational databases. 3. Provides object and database mapping. |
| Entity Framework | C# | 1. Microsoft’s ORM framework for the .NET platform. 2. Integrates with various databases. 3. Provides LINQ query language support. |
| SQLAlchemy | Python | 1. An ORM framework for Python, supporting multiple database backends. 2. Provides a highly flexible query language. 3. Supports transaction management. |
| Sequelize | JavaScript | 1. A JavaScript ORM framework for the Node.js environment. 2. Supports various databases like MySQL, PostgreSQL, SQLite. 3. Uses Promises for asynchronous operations. |
| Beanie | Python | 1. An asynchronous ODM (object-document mapper) for MongoDB in Python. 2. Defines database document structures based on Pydantic models. 3. Supports asynchronous operations and MongoDB features. |
| Prisma | TypeScript | 1. A database access tool for Node.js and TypeScript. 2. Generates a type-safe query API automatically. 3. Supports multiple database backends such as MySQL, PostgreSQL. |
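To show what an ORM does underneath, here is a toy mapper built on the standard library. It is a sketch of the idea only and is not how any of the frameworks above are implemented; the User class and the save and get helpers are hypothetical.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional

@dataclass
class User:
    name: str
    email: str
    id: Optional[int] = None  # filled in by the database on insert

def create_table(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS user (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )

def save(conn: sqlite3.Connection, user: User) -> User:
    """Object -> row: insert the object's fields and record the generated key."""
    cur = conn.execute(
        "INSERT INTO user (name, email) VALUES (?, ?)", (user.name, user.email)
    )
    user.id = cur.lastrowid
    return user

def get(conn: sqlite3.Connection, user_id: int) -> Optional[User]:
    """Row -> object: load one row and rebuild a User from its columns."""
    row = conn.execute(
        "SELECT name, email, id FROM user WHERE id = ?", (user_id,)
    ).fetchone()
    return User(*row) if row else None

conn = sqlite3.connect(":memory:")
create_table(conn)
saved = save(conn, User(name="Ada", email="ada@example.com"))
loaded = get(conn, saved.id)
print(loaded)
```

A real ORM generates the SQL from the class definition, tracks changes, and handles relationships, so application code works with objects instead of hand-written queries.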

AOP (Aspect-Oriented Programming)

While the CRUD operations discussed earlier cover the basics, real-world development often involves considerations such as logging, caching, transactions, and more. These concerns can be addressed using Aspect-Oriented Programming (AOP).

AOP is a programming paradigm designed to improve code modularity and maintainability by separating cross-cutting concerns from the core business logic. Cross-cutting concerns refer to functionalities scattered throughout the application that are unrelated to the core concerns, such as logging, transaction management, security, etc.

from functools import wraps
from typing import Dict, List

from fastapi import FastAPI, Depends, HTTPException
from fastapi.security import OAuth2PasswordBearer
from jose import JWTError, jwt

app = FastAPI()

# Sample JWT token validation function
async def get_current_user(token: str = Depends(OAuth2PasswordBearer(tokenUrl="token"))):
    credentials_exception = HTTPException(
        status_code=401,
        detail="Could not validate credentials",
        headers={"WWW-Authenticate": "Bearer"},
    )
    try:
        # "secret_key" is a placeholder; load the real key from configuration
        payload = jwt.decode(token, "secret_key", algorithms=["HS256"])
        username: str = payload.get("sub")
        if username is None:
            raise credentials_exception
    except JWTError:
        raise credentials_exception

    return username

# AOP-style logging aspect
def log_request(endpoint):
    # wraps() preserves the endpoint's signature, which FastAPI inspects
    # to resolve parameters such as Depends(get_current_user)
    @wraps(endpoint)
    async def wrapper(*args, **kwargs):
        print(f"Received request to {endpoint.__name__}")
        response = await endpoint(*args, **kwargs)
        print(f"Request to {endpoint.__name__} completed")
        return response
    return wrapper

# Applying AOP aspect to an endpoint
# response_model matches the list of dicts that the endpoint returns
@app.get("/items/", response_model=List[Dict[str, str]])
@log_request
async def read_items(current_user: str = Depends(get_current_user)):
    return [{"item": "Item 1"}, {"item": "Item 2"}]

In the example above, the log_request function acts as an AOP aspect, responsible for logging the start and end of a request. This aspect is applied to the /items/ route using the @log_request decorator. This way, we can add cross-cutting concerns like logging without modifying the core business logic.
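The same idea works without a web framework. The following sketch wraps a plain function with a logging and timing aspect; the logged decorator, the call_log list, and add_to_cart are hypothetical names for illustration.

```python
import functools
import time
from typing import Callable, List

call_log: List[str] = []  # stands in for a real logger

def logged(func: Callable) -> Callable:
    """Aspect: record entry, exit, and duration around any function."""
    @functools.wraps(func)  # keep the wrapped function's name and signature
    def wrapper(*args, **kwargs):
        call_log.append(f"enter {func.__name__}")
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            call_log.append(f"exit {func.__name__} ({elapsed_ms:.2f} ms)")
    return wrapper

@logged
def add_to_cart(cart: list, item: str) -> list:
    # Core business logic: no logging code in here.
    return cart + [item]

result = add_to_cart(["apple"], "pear")
print(result)
print(call_log)
```

Using try/finally means the exit entry is recorded even when the wrapped function raises, which is usually what a logging or metrics aspect needs.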

Synchronous, Asynchronous, Multithreading, and Thread Pools

Synchronous vs. Asynchronous

| Feature | Synchronous | Asynchronous |
| --- | --- | --- |
| Execution | Sequential, one operation completes before the next begins | Non-sequential, handled through callbacks, events, etc. |
| Blocking | Blocks, program waits for the operation to complete | Non-blocking, one operation’s start doesn’t wait for the completion of the previous one |
| Efficiency | May lead to long waiting times, impacting efficiency and performance | Can improve efficiency and performance, as multiple operations can be in progress concurrently |
| Examples | Traditional blocking I/O operations, like reading files | Asynchronous I/O operations, callback functions, event-driven approaches such as asynchronous resource loading on web pages |
| Programming Style | Coding is relatively simple but may prevent the program from executing other tasks | Coding may be more complex, requiring handling of callback functions and event listeners, but allows more flexible handling of multiple tasks |

Examples of Asynchronous Programming in JavaScript

Callback Function:

function fetchData(callback) {
  // Simulating asynchronous operation
  setTimeout(() => {
    const data = 'Async Data';
    callback(data);
  }, 1000);
}

fetchData((result) => {
  console.log(result); // Handling the retrieved data in the callback function
});

Promise:

function fetchData() {
  return new Promise((resolve, reject) => {
    // Simulating asynchronous operation
    setTimeout(() => {
      const data = 'Async Data';
      resolve(data);
      // Alternatively, reject(new Error('Asynchronous operation failed'));
    }, 1000);
  });
}

fetchData()
  .then((result) => {
    console.log(result); // Handling the retrieved data in the 'then' method of the Promise
  })
  .catch((error) => {
    console.error(error); // Handling errors in the Promise
  });

Async/Await:

async function fetchData() {
  return new Promise((resolve) => {
    // Simulating asynchronous operation
    setTimeout(() => {
      const data = 'Async Data';
      resolve(data);
    }, 1000);
  });
}

async function getData() {
  try {
    const result = await fetchData();
    console.log(result); // Handling the retrieved data in the 'async/await' context
  } catch (error) {
    console.error(error); // Handling errors in asynchronous operations
  }
}

getData();
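Backend languages offer the same model. As a rough equivalent of the async/await example, here is a sketch using Python's asyncio; fetch_data and the delays are made up for illustration. Three simulated I/O calls run concurrently, so the total time is close to one delay rather than the sum of all three.

```python
import asyncio
import time

async def fetch_data(label: str, delay: float) -> str:
    # Simulating asynchronous I/O; the event loop runs other tasks meanwhile.
    await asyncio.sleep(delay)
    return f"{label} done"

async def main() -> list:
    # gather() runs the coroutines concurrently and keeps the argument order.
    return await asyncio.gather(
        fetch_data("a", 0.2),
        fetch_data("b", 0.2),
        fetch_data("c", 0.2),
    )

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start

print(results)
print(f"elapsed: {elapsed:.2f}s")
```

Awaiting the three calls one after another would take about 0.6 seconds; gathering them overlaps the waiting time.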

Multithreading and Thread Pool

Threads:

A thread is the smallest unit that the operating system can schedule for execution, consisting of a thread ID, program counter, register set, and stack. In a multithreaded environment, multiple threads share common resources such as code segments and data segments, but each thread has its own registers and stack for storing thread-specific data.

Thread Pool:

A thread pool is a mechanism for managing and reusing threads to improve thread utilization and reduce the overhead of thread creation and destruction. A thread pool consists of a certain number of threads that wait for task allocation and execution. When a task arrives, the thread pool assigns a thread to execute the task, and upon completion, the thread is not destroyed but returned to the thread pool, awaiting the next task. This approach avoids the frequent creation and destruction of threads, improving system performance and responsiveness.

| Language | Support |
| --- | --- |
| Java | Native support for threads and thread pools through java.lang.Thread and the java.util.concurrent package. |
| Python | Supports threads through the threading and concurrent.futures modules. In standard CPython builds the Global Interpreter Lock (GIL) prevents threads from running Python bytecode in parallel, so threads mainly help I/O-bound work. |
| C++ | Supports native thread libraries such as <thread> and <future>, and third-party libraries like Boost. |
| JavaScript | Supports asynchronous programming using event loops and callback functions; Web Workers enable script execution in independent threads. |
| C# | Native support for threads and thread pools through the System.Threading namespace and the Task class. |
| TypeScript | Similar to JavaScript, supports asynchronous programming relying on event loops and callback mechanisms without direct thread manipulation. |
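The thread pool described above can be sketched with Python's concurrent.futures module. The handle_task function and the pool size are illustrative assumptions. Twelve tasks are served by at most four reused threads.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

def handle_task(task_id: int) -> tuple:
    # Simulated blocking I/O (e.g. a network call); sleeping releases the GIL.
    time.sleep(0.05)
    return task_id, threading.current_thread().name

# At most four worker threads are created, and they are reused across tasks.
with ThreadPoolExecutor(max_workers=4, thread_name_prefix="worker") as pool:
    outcomes = list(pool.map(handle_task, range(12)))

task_ids = [task_id for task_id, _ in outcomes]
thread_names = {name for _, name in outcomes}
print(task_ids)
print(len(thread_names), "threads served", len(outcomes), "tasks")
```

pool.map returns results in submission order even though the tasks finish in whatever order the threads complete them.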

Docker Principles

Docker leverages certain features of the Linux kernel, such as Namespaces and Control Groups (Cgroups). Namespaces isolate processes, networks, file systems, etc., while Cgroups restrict and monitor resource usage, such as CPU and memory. Docker uses a union filesystem (such as OverlayFS) to implement layered container images. Together these features provide a lightweight, portable containerization platform that simplifies application deployment, management, and scaling.

Docker’s core concepts include Images, Containers, Repositories, and Services.

Images: A Docker image forms the foundation of a container, containing all the necessary files, libraries, and dependencies required to run an application. Images are immutable, ensuring consistent runtime environments across different environments.

Containers: Containers are instances of Docker images. Each container is isolated, with its own file system, process space, and network interface. Containers offer a lightweight, portable runtime environment.

Dockerfile: A Dockerfile is a text file used to define the process of building a Docker image. It allows developers to specify the application’s environment, dependencies, and how to run it.

Networking and Port Mapping: Docker enables communication between containers and the host or other containers. Each container has its own network namespace, and Docker provides network drivers (e.g., bridge, host, overlay) to configure communication between containers. Docker also supports port mapping, allowing internal container ports to be mapped to ports on the host for external access.

Container Orchestration: Multi-container applications are managed with orchestration tools. Docker provides Docker Compose and Swarm mode, while Kubernetes is a separate open-source orchestrator that is widely used to run containers. Docker Compose is used to define and coordinate the deployment, scaling, and updating of multiple containers. A simple docker-compose.yml file may look like:

version: '3'
services:
  web:
    image: nginx
    ports:
      - "8080:80"
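  db:
    image: postgres
    environment:
      # The official postgres image will not start without a password; placeholder value
      POSTGRES_PASSWORD: example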

Docker Repositories: Docker repositories serve as centralized storage and management for Docker images. They allow users to upload and download Docker images, facilitating sharing and deployment of applications across different environments. Docker officially provides the public Docker Hub repository, where users can find a plethora of official and community-maintained images. Additionally, users can set up their private Docker repositories to meet specific needs or enhance security. Here are some commonly used repositories:

| Name | Description | Link |
| --- | --- | --- |
| Docker Hub | Official public repository by Docker, containing official and community-maintained images. | Docker Hub |
| Amazon ECR | Amazon’s cloud service for storing, managing, and deploying Docker images. | Amazon ECR |
| Google Container Registry | Google Cloud’s Docker image repository service, integrated with Google Cloud Platform. | GCR |
| Azure Container Registry | Microsoft Azure’s Docker image repository service, integrated with the Azure cloud platform. | ACR |

Docker Services: Docker services provide an abstraction layer for defining and running distributed applications within Docker. Services can encompass multiple instances of containers running on different nodes, enabling load balancing and high availability. Docker services are often used in conjunction with Docker Compose, where a docker-compose.yml file defines the service configuration.

References

https://fastapi.tiangolo.com/

https://spring.io/projects/spring-boot/

https://www.prisma.io/

https://www.sqlalchemy.org/

https://beanie-odm.dev/

https://hub.docker.com/
