ascii-text-art (3)

Minishell

Code

Table of Contents

  1. Overview
  2. Project Challenges
  3. Team Development Steps for Minishell
    1. Initial Planning and Setup
    2. Research and Design Phase
      1. Shell Operations
      2. External Functions
    3. Parsing and Input Management
      1. Command Reading
      2. Steps for building the parser and AST (Abstract Syntax Tree)
        1. Designing a Syntax Error Checker
        2. Syntax Error Checker Function
        3. Tokenization
        4. Token Structure
        5. Tokenization Strategy
        6. Tokenization Function
        7. Parsing
        8. AST Node Structure
        9. Parsing Function
        10. AST Visualization
    4. Command Execution
    5. Built-in Commands Implementation
    6. Signal Handling
    7. Environment Variable Expansion
    8. Testing and Debugging
  4. Project Structure
  5. Team Members
  6. Resources
  7. Acknowledgements

Overview

Our Minishell project represents a collaborative effort to build a custom shell program in C, inspired by the functionalities of bash, the Bourne Again SHell. As part of the 42 School curriculum, this project served as a unique opportunity for us to deepen our collective understanding of system processes, file descriptors, and the complexities of command-line interfaces. Through the development of our own shell, we engaged in a comprehensive exploration of Unix-based systems, gaining hands-on experience in process control, command interpretation, and environmental variable management.

Project Challenges

Team Development Steps for Minishell

Step 1: Initial Planning and Setup

Step 2: Research and Design Phase

This phase was about understanding the shell's operations, researching the external functions allowed, and dividing them among ourselves to research and explain their usage to the team.

Shell Operations

shell

terminals

External Functions

Reviewed the external functions allowed, dividing them among ourselves to research and explain their usage to the team.

Readline Functions:

Function Description
readline Reads a line from the standard input and returns it.
rl_clear_history Clears the readline history list.
rl_on_new_line Prepares readline for reading input on a new line.
rl_replace_line Replaces the content of the readline current line buffer.
rl_redisplay Updates the display to reflect changes to the input line.
add_history Adds the most recent input to the readline history list.

Standard I/O Functions:

Function Description
printf Outputs formatted data to stdout.

Memory Allocation Functions:

Function Description
malloc Allocates specified bytes of heap memory.
free Deallocates previously allocated memory.

File I/O Functions:

Function Description
write Writes data to a file descriptor.
access Checks calling process's permissions for a file or directory.
open Opens a file or device, returning a file descriptor.
read Reads data from a file descriptor into a buffer.
close Closes a previously opened file descriptor.

Process Control Functions:

Function Description
fork Creates a new process by duplicating the calling process.
wait Suspends execution of the calling process until one of its children terminates.
waitpid Waits for a specific child process to change state.
wait3 Waits for any child process to change state.
wait4 Waits for a specific child process to change state.
signal Handles or ignores signals sent to the process.
sigaction Handles or ignores signals sent to the process.
sigemptyset Initializes and adds signals to a signal set.
sigaddset Initializes and adds signals to a signal set.
kill Sends a signal to a process or a group of processes.
exit Terminates the calling process.

Directory Functions:

Function Description
getcwd Gets the current working directory.
chdir Changes the current working directory.
stat Returns information about a file or a file descriptor.
lstat Returns information about a file or a file descriptor.
fstat Returns information about a file or a file descriptor.
unlink Removes a link to a file.
execve Replaces the current process image with a new process image.

File Descriptor Functions:

Function Description
dup Duplicates a file descriptor.
dup2 Duplicates a file descriptor.
pipe Creates a pipe for inter-process communication.

Directory Functions:

Function Description
opendir Manages directory streams.
readdir Manages directory streams.
closedir Manages directory streams.

Error Handling Functions:

Function Description
strerror Returns a pointer to the textual representation of an error code.
perror Returns a pointer to the textual representation of an error code.

Terminal Functions:

Function Description
isatty Tests whether a file descriptor refers to a terminal.
ttyname Returns the name of the terminal associated with a file descriptor.
ttyslot Returns the name of the terminal associated with a file descriptor.
ioctl Controls device-specific input/output operations.
getenv Returns the value of an environment variable.
tcsetattr Sets and gets terminal attributes.
tcgetattr Sets and gets terminal attributes.
tgetent Terminal handling functions from the termcap library.
tgetflag Terminal handling functions from the termcap library.
tgetnum Terminal handling functions from the termcap library.
tgetstr Terminal handling functions from the termcap library.
tgoto Terminal handling functions from the termcap library.
tput Terminal handling functions from the termcap library.

Step 3: Parsing and Input Management

Command Reading

Readline Library: Implemented readline and integrated add_history ( GNU Readline)

brew install readline

Add the following to the Makefile:

READLINE_INCLUDE = $(shell brew --prefix readline)/include
READLINE_LIB = $(shell brew --prefix readline)/lib
INCLUDES = -I./includes -I./lib/libft -I$(READLINE_INCLUDE)
#include <stdio.h>
#include <readline/readline.h>
#include <readline/history.h>

int main(void)
{
    char *input;

    while (1)
    {
        input = readline("minishell$ ");
        if (!input)
            break;
        if (*input)
            add_history(input);
        printf("Input: %s\n", input);
        free(input);
    }
    return 0;
}

readline(3) - Linux manual page

Steps for building the parser and AST (Abstract Syntax Tree)

a. Designing a Syntax Error Checker

the syntax error checker will be responsible for identifying and reporting syntax errors in the user input.

  1. Unclosed Quotes: verify that all quotes are properly closed.
  2. Misplaced Operators: Detect if pipes | are used incorrectly, such as being at the start or end of the input, or if multiple pipes are used consecutively.
  3. Logical Operators: Detect logical operators such as && and || and report them as not supported.
  4. Invalid Redirections: Detect invalid redirections, such as multiple consecutive redirections or redirections at the start or end of the input.

b. Syntax Error Checker Function

syntax_checker.c:

c. Tokenization

The goal of the tokenization process is to break down the input string into a series of tokens that the parser can easily understand. These tokens represent commands, arguments, pipes, redirections, and other elements of the shell syntax.

d. Token Structure

// Token type enumeration
typedef enum e_token_type
{
    TOKEN_WORD,      // For commands and arguments
    TOKEN_PIPE,      // For '|'
    TOKEN_REDIR_IN,  // For '<'
    TOKEN_REDIR_OUT, // For '>'
    TOKEN_REDIR_APPEND, // For '>>'
    TOKEN_REDIR_HEREDOC, // For '<<'
    TOKEN_ENV_VAR, // For environment variables
}   t_token_type;

// Token structure
typedef struct s_token
{
    t_token_type type;
    char        *value;
    struct s_token *next;
}   t_token;

e. Tokenization Strategy

  1. Whitespace Handling: Skip whitespace outside quotes to separate commands and arguments.
  2. Quoting: Correctly handle single (') and double quotes ("), preserving the text exactly as is within single quotes and allowing for variable expansion and escaped characters within double quotes.
  3. Redirections and Pipes: Detect >, >>, <, <<, and |, treating them as separate tokens while managing any adjacent whitespace.

f. Tokenization Function

The tokenization function will iterate through the input string, identifying and categorizing segments according to the shell syntax.

tokenization.c:

Example:

> ls -l | wc -l > output.txt | ls > output2.txt
Token:  ls                    | Type:  WORD                
--------------------------------------------------
Token:  -l                    | Type:  WORD                
--------------------------------------------------
Token:  |                     | Type:  PIPE                
--------------------------------------------------
Token:  wc                    | Type:  WORD                
--------------------------------------------------
Token:  -l                    | Type:  WORD                
--------------------------------------------------
Token:  >                     | Type:  REDIRECT_OUT        
--------------------------------------------------
Token:  output.txt            | Type:  WORD                
--------------------------------------------------
Token:  |                     | Type:  PIPE                
--------------------------------------------------
Token:  ls                    | Type:  WORD                
--------------------------------------------------
Token:  >                     | Type:  REDIRECT_OUT        
--------------------------------------------------
Token:  output2.txt           | Type:  WORD                
--------------------------------------------------

g. Parsing

The parsing process involves analyzing the tokens to understand their syntactical relationship. This step constructs a representation of the input that reflects the user's intention.

h .AST Node Structure

The AST node structure will represent the input string in a way that reflects the user's intention. The AST will be composed of nodes that represent commands, arguments, redirections, and pipelines.

typedef struct s_ast_node
{
    t_node type type;
    char *args;
    struct s_ast_node *left;
    struct s_ast_node *right;
}   t_ast_node;

i. Parsing Function

The parsing function will iterate through the tokens, building an abstract syntax tree (AST) that represents the input string.

parse.c:

j. AST Visualization

generateastdiagram/generateastdiagram.c:

Example:

ls -l | wc -l > output.txt | ls > output2.txt

graphviz

GraphvizOnline

Step 4: Command Execution

Step 5: Built-in Commands Implementation

builtins

Step 6: Signal Handling

signals

Step 7: Environment Variable Expansion

Step 8: Testing and Debugging

Project Structure

├── includes
│   └── minishell.h
├── lib
│   └── libft
├── Makefile
└── src
    ├── 01_input_validation
    ├── 02_tokenization
    ├── 03_parsing
    ├── 04_execution
    ├── 05_builtin
    ├── 06_signals
    ├── 07_env_var_expansion
    ├── main.c
    └── utils

Team Members

Resources

Acknowledgements

Thanks to reda ghouzraf, MTRX, Nasreddine hanafi, Khalid zerri for their help and support during the project.