Understanding Nginx in One Article

Posted on 2022-07-22 in Nginx

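Tengine download and docs: http://tengine.taobao.org/
Nginx official site and docs: http://nginx.org
Features: low CPU and memory usage and strong concurrency; commonly cited as handling up to 50,000 concurrent connections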

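Advantages of nginx over Apache:

Lightweight: serving the same web workload, it uses less memory and fewer resources than Apache
Handles high concurrency well: nginx processes requests asynchronously and without blocking, whereas Apache's traditional process-per-connection model blocks, so under high concurrency nginx keeps resource usage low and performance high
Highly modular design; writing a module is relatively simple
Active community that ships high-performance modules quickly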

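Advantages of Apache over nginx:

rewrite: Apache's rewrite is more powerful than nginx's
A huge number of modules; almost anything you can think of already exists
Fewer bugs; nginx has comparatively more

Nginx configuration is concise; Apache's is more complex
The core difference: Apache (with its classic prefork model) is synchronous and multi-process, with one connection per process; nginx is asynchronous, and a single process can serve many connections (on the order of tens of thousands)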

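Configuration file

conf/nginx.conf

1. Global block

Global settings (everything before the events block), such as the number of worker processes and the location of the error log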

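# User the worker processes run as
#user nobody;
# Number of worker processes, usually set to the number of CPU cores
worker_processes 1;

# Global error log
#error_log  logs/error.log;
#error_log  logs/error.log  notice;
#error_log  logs/error.log  info;

# Location of the pid file
#pid        logs/nginx.pid;

# Maximum number of file descriptors a single nginx worker process may open
#worker_rlimit_nofile 65535;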

2. events

Mainly affects the network connections between the nginx server and its clients

events {
    # Event model to use: use [ kqueue | rtsig | epoll | /dev/poll | select | poll ];
    # epoll is the high-performance network I/O model in Linux kernels 2.6 and later; on FreeBSD, use kqueue.
    use epoll;
    # Maximum number of connections per worker process: 1024
    worker_connections 1024;
}

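Maximum connections per process:

Total concurrency: worker_processes * worker_connections
With reverse proxying, total concurrency = worker_processes * worker_connections / 4. The divisor of 4 is a rule of thumb: worker_connections also counts connections to upstream servers, so each proxied request holds two connections, and the rule assumes a browser opens two connections per site.
Because concurrency is bounded by I/O, max_clients must be smaller than the number of files the system can open

How many file handles can be opened

System-wide limit: /proc/sys/fs/file-max
Handles currently in use: /proc/sys/fs/file-nr
Change the limit for the current shell: ulimit -SHn 65535

For example, running cat /proc/sys/fs/file-max prints: 97320
The total number of concurrent connections must stay below the number of file handles the system can open, so that the load remains within what the operating system can sustain
So worker_connections should be chosen with the number of worker_processes and the system's maximum number of open files in mind, keeping total concurrency below the operating system's open-file limit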
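
The sizing rule above is simple arithmetic, sketched here in Python. The function names `max_clients` and `fits_fd_limit` are made up for illustration, and the divisor of 4 is the rule of thumb quoted above, not an nginx constant.

```python
def max_clients(worker_processes: int, worker_connections: int,
                reverse_proxy: bool = False) -> int:
    """Estimate total concurrency from the two nginx directives."""
    total = worker_processes * worker_connections
    # Rule of thumb from the text: divide by 4 when nginx acts as a reverse proxy.
    return total // 4 if reverse_proxy else total


def fits_fd_limit(clients: int, fd_limit: int) -> bool:
    """True when the estimated concurrency stays below the open-file limit."""
    return clients < fd_limit


# Values taken from the text: 1024 connections per worker, file-max of 97320.
estimate = max_clients(worker_processes=4, worker_connections=1024, reverse_proxy=True)
ok = fits_fd_limit(estimate, fd_limit=97320)
```

With 4 workers of 1024 connections each behind a reverse proxy, the estimate is 1024 clients, well under the example file-max value.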

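3. http

The most frequently edited part: virtual hosts, listening ports, request forwarding, reverse proxying, load balancing, and so on
Each server block can serve requests independently, so one machine can host several web services. Three kinds of virtual host are supported:

IP-based virtual hosts (one machine bound to several IP addresses)
Name-based virtual hosts (server_name)
Port-based virtual hosts (listen with only a port and no IP)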

# http global settings
http {
    # Include the MIME type definitions
    include       mime.types;
    default_type  application/octet-stream;

    # Log format
    #log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
    #                  '$status $body_bytes_sent "$http_referer" '
    #                  '"$http_user_agent" "$http_x_forwarded_for"';

    #access_log  logs/access.log  main;

    # Enable efficient file transfer
    # Whether to use the sendfile() call to send files. Set to on for ordinary applications;
    # for download services and other disk-I/O-heavy workloads it can be set to off to balance disk and network I/O and reduce system load
    sendfile      on;
    # Optimizes TCP transmission on Linux/Unix; only effective when sendfile is on
    #tcp_nopush     on;

    # Enable directory listings; suits a download server. Off by default.
    autoindex on;

    # Keep-alive timeout, in seconds
    #keepalive_timeout  0;
    keepalive_timeout 65;

    # Upstream server group used for reverse proxying
    upstream myServer {
        # weight: defaults to 1. The larger the weight, the larger the share of requests.
        server 127.0.0.1:8080    weight=5  max_conns=800;
        server 127.0.0.1:8081;
    }

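    # Enable gzip compression
    #gzip  on;

    # Minimum response length to compress. The default is 20 bytes.
    # The length is taken from the Content-Length response header.
    # Raising it to 1k or more is advisable; very small responses can end up larger after compression.
    #gzip_min_length 1k;

    # Number and size of the buffers used to hold the compressed response.
    # For example, 4 4k means 4 buffers of 4k each,
    #              4 8k means 4 buffers of 8k each.
    # If unset, the default is 32 4k or 16 8k, depending on the platform's memory page size.
    #gzip_buffers 4 16k;

    # Minimum HTTP version of a request required for compression (default 1.1; use 1.0 if the front end is squid 2.5)
    #gzip_http_version 1.0;

    # Compression level, 1-9; higher values compress better but use more CPU time
    #gzip_comp_level 2;

    # MIME types to compress; text/html is always included
    #gzip_types text/plain application/x-javascript text/css application/xml;

    # Disable compression for IE6 and below
    #gzip_disable "MSIE [1-6]\.";

    # For CDNs and proxy servers:
    # adds "Vary: Accept-Encoding" so caches can keep compressed and uncompressed copies of the same URL
    #gzip_vary on;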


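    # server block: a virtual host
    # Each server block can serve requests independently, so one machine can host several web services
    server {
        # Port to listen on
        listen       80;
        # Serve requests for localhost. Several names can be listed, separated by spaces
        server_name  localhost xxx.com;

        # Character set
        #charset koi8-r;

        #access_log  logs/host.access.log  main;

        # Default location
        # location URI {}      prefix match: applies to the path and everything under it
        # location = URI {}    exact match of the given path only, not sub-paths
        # location ~ URI {}    regular-expression match, case-sensitive
        # location ~* URI {}   regular-expression match, case-insensitive
        # location ^~ URI {}   prefix match; if it is the longest matching prefix, regular expressions are not checked
        # Priority: = > ^~ > ~ | ~* > / | /dir/
        location / {
            proxy_pass http://myServer/;
            # Default document root
            # root   html;
            # Default index pages
            # index  index.html index.htm;
        }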

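        # IP access control
        # Syntax: deny | allow <IP> | <CIDR range>; rules are checked in order and the first match wins
        # Replace URI with the path to protect
        location URI {
            deny  192.168.1.109;
            allow 192.168.1.0/24;
            allow 192.168.0.0/16;
            allow 192.0.0.0/8;
        }

        # Access protected by HTTP basic authentication
        location ~ (.*)\.avi$ {
            auth_basic  "closed site";
            auth_basic_user_file conf/users;
        }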

        #error_page  404              /404.html;

        # redirect server error pages to the static page /50x.html
        #
        # Error pages
        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
            root   html;
        }

        # proxy the PHP scripts to Apache listening on 127.0.0.1:80
        #
        #location ~ \.php$ {
        #    proxy_pass   http://127.0.0.1;
        #}

        # pass the PHP scripts to FastCGI server listening on 127.0.0.1:9000
        #
        #location ~ \.php$ {
        #    root           html;
        #    fastcgi_pass   127.0.0.1:9000;
        #    fastcgi_index  index.php;
        #    fastcgi_param  SCRIPT_FILENAME  /scripts$fastcgi_script_name;
        #    include        fastcgi_params;
        #}

        # deny access to .htaccess files, if Apache's document root
        # concurs with nginx's one
        #
        #location ~ /\.ht {
        #    deny  all;
        #}
    }
}

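Applications

1. Reverse proxy server

A proxy server configured in the browser is a forward proxy. As a reverse proxy, Nginx receives client requests and forwards them to backend servers

Deploy Tomcat, keeping its default listening port 8080
Edit the nginx configuration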

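Using multiple location blocks (matching priority, highest first):
- Exact match: location = /lagou {…}
- Prefix match that skips regular expressions: location ^~ /lagou {…}
- Regular-expression match, case-insensitive: location ~* /lagou {…}
- Regular-expression match, case-sensitive: location ~ /lagou {…}
- Plain prefix match: location /lagou {…}

The two regular-expression forms share the same priority; among them, the first one that matches in configuration-file order wins.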
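
The selection order in the list above can be sketched as a small Python function. This is an illustrative model, not nginx code: `match_location` and the `(modifier, pattern)` tuple layout are assumptions made for the example, and Python's `re` stands in for the PCRE engine nginx uses.

```python
import re
from typing import List, Optional, Tuple

# (modifier, pattern) in configuration order; modifier is "=", "^~", "~", "~*" or "" (plain prefix).
Location = Tuple[str, str]


def match_location(uri: str, locations: List[Location]) -> Optional[Location]:
    # 1. An exact match wins immediately.
    for loc in locations:
        if loc[0] == "=" and loc[1] == uri:
            return loc
    # 2. Remember the longest matching prefix among plain and ^~ locations.
    best: Optional[Location] = None
    for loc in locations:
        if loc[0] in ("", "^~") and uri.startswith(loc[1]):
            if best is None or len(loc[1]) > len(best[1]):
                best = loc
    # 3. If that longest prefix is a ^~ location, regular expressions are skipped.
    if best is not None and best[0] == "^~":
        return best
    # 4. Otherwise try regular expressions in configuration order; the first match wins.
    for loc in locations:
        if loc[0] in ("~", "~*"):
            flags = re.IGNORECASE if loc[0] == "~*" else 0
            if re.search(loc[1], uri, flags):
                return loc
    # 5. Fall back to the longest plain prefix (or None if nothing matched).
    return best


locations = [
    ("", "/"),
    ("", "/lagou"),
    ("~*", r"\.PNG$"),
    ("~", r"^/lagou/api"),
    ("^~", "/lagou/static/"),
    ("=", "/lagou"),
]
```

For instance, `match_location("/lagou/static/a.png", locations)` selects the `^~` block even though the `~*` pattern would also match, because the longest prefix carries `^~`.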

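2. Load balancing

Round robin

The default strategy: requests are distributed to the servers in turn, and a server that goes down is automatically taken out of rotation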

upstream lagouServer{
  server 111.229.248.243:8080;
  server 111.229.248.243:8082;
} 

location /abc {
  proxy_pass http://lagouServer/;
}

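Weight

Every backend server has a default weight of 1; the higher the weight, the more requests it receives (useful when servers differ in capacity)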

upstream lagouServer{ 
  server 111.229.248.243:8080 weight=1; 
  server 111.229.248.243:8082 weight=2;
}
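
How weights turn into a request sequence can be illustrated with a smooth weighted round-robin, sketched below in Python. It is a simplified model under stated assumptions: the function name is made up, and it ignores failed servers and limits such as max_conns, which a real balancer has to track.

```python
from typing import Dict, List


def smooth_weighted_round_robin(weights: Dict[str, int], n: int) -> List[str]:
    """Return the servers chosen for n consecutive requests."""
    current = {server: 0 for server in weights}
    total = sum(weights.values())
    picks: List[str] = []
    for _ in range(n):
        # Every server gains its configured weight on each round.
        for server, weight in weights.items():
            current[server] += weight
        # The server with the highest running score is chosen...
        chosen = max(current, key=current.get)
        # ...and pays back the total weight, which spreads its turns out evenly.
        current[chosen] -= total
        picks.append(chosen)
    return picks


picks = smooth_weighted_round_robin(
    {"111.229.248.243:8080": 1, "111.229.248.243:8082": 2}, 6
)
```

Over six requests the server with weight=2 is chosen four times and the one with weight=1 twice, and the heavier server's turns are interleaved rather than sent back to back.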

ip_hash

Each request is assigned by a hash of the client IP, so requests from a given client always go to the same backend server, which addresses the session problem

upstream lagouServer{
  ip_hash;
  server 111.229.248.243:8080;
  server 111.229.248.243:8082;
}
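
The idea behind ip_hash can be sketched in a few lines of Python. The sketch keys on the first three octets of an IPv4 address, as the nginx documentation describes, but `pick_by_ip_hash` is a made-up name and CRC32 is only a stand-in hash chosen for illustration, not nginx's own function.

```python
import zlib
from typing import List


def pick_by_ip_hash(client_ip: str, servers: List[str]) -> str:
    """Map an IPv4 client address to one backend, stable across requests."""
    # Key on the first three octets, so a whole /24 lands on the same backend.
    key = ".".join(client_ip.split(".")[:3])
    index = zlib.crc32(key.encode("ascii")) % len(servers)
    return servers[index]


servers = ["111.229.248.243:8080", "111.229.248.243:8082"]
first = pick_by_ip_hash("192.168.1.109", servers)
again = pick_by_ip_hash("192.168.1.109", servers)
```

Because the mapping depends only on the client address and the server list, `first` and `again` are always the same backend; changing the number of servers, however, remaps most clients.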

Consistent hash

ngx_http_upstream_consistent_hash is a third-party module that must be downloaded and installed before use
Download and unpack it
Then run, in order: ./configure --add-module=/root/ngx_http_consistent_hash-master, make, make install
- consistent_hash $remote_addr: map by client IP
- consistent_hash $request_uri: map by the requested URI
- consistent_hash $args: map by the query arguments sent by the client

upstream myServer {
  consistent_hash $request_uri;
  server 127.0.0.1:8080;
  server 127.0.0.1:8082;
}
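
What "consistent" buys over a plain modulo hash is easiest to see in code. Below is a minimal Python hash ring with virtual nodes; the class `HashRing`, the replica count of 100, and the use of SHA-256 are all assumptions for illustration and do not describe the third-party module's internals.

```python
import bisect
import hashlib
from typing import Dict, Iterable, List


class HashRing:
    def __init__(self, servers: Iterable[str], replicas: int = 100) -> None:
        self._replicas = replicas
        self._ring: Dict[int, str] = {}   # ring point -> server
        self._points: List[int] = []      # sorted ring points
        for server in servers:
            self.add(server)

    @staticmethod
    def _hash(key: str) -> int:
        # First 8 bytes of SHA-256 as an integer position on the ring.
        return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")

    def add(self, server: str) -> None:
        # Each server owns several virtual points to even out the distribution.
        for i in range(self._replicas):
            point = self._hash(f"{server}#{i}")
            self._ring[point] = server
            bisect.insort(self._points, point)

    def remove(self, server: str) -> None:
        for i in range(self._replicas):
            point = self._hash(f"{server}#{i}")
            del self._ring[point]
            self._points.remove(point)

    def lookup(self, key: str) -> str:
        # Walk clockwise to the first point after the key, wrapping around at the end.
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._ring[self._points[idx]]


ring = HashRing(["127.0.0.1:8080", "127.0.0.1:8082"])
backend = ring.lookup("/lagou/index.html")
```

When a server is removed, only the keys that pointed at it move to another backend; every other key keeps its mapping, which is what makes the scheme attractive for caches.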

3. Static/dynamic separation

# Static resources are loaded directly from a directory on the Nginx server
location /static/ {
  root staticData;
}
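
With the root directive, the file path is formed by appending the full request URI to the root directory, and a relative root is resolved against the nginx installation prefix. The Python sketch below illustrates that mapping; the function name and the prefix /usr/local/nginx are assumptions for the example.

```python
import posixpath


def resolve_static_path(uri: str, root: str = "staticData",
                        prefix: str = "/usr/local/nginx") -> str:
    """Return the file path a `root` directive would map the URI to."""
    # A relative root is taken relative to the prefix; an absolute root is used as is.
    base = posixpath.join(prefix, root)
    # The whole URI, including the /static/ part, is appended to the root.
    return base + uri


path = resolve_static_path("/static/css/app.css")
```

Under these assumptions a request for /static/css/app.css is read from /usr/local/nginx/staticData/static/css/app.css, so the files need to sit in a static subdirectory of staticData.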

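Underlying process model

After startup, Nginx runs in the background as a multi-process daemon: one Master process and several Worker processes

1. Structure

Master process: manages the worker processes
- Receives external signals and relays them to the workers (./nginx -s reload)
- Monitors the workers; when a worker exits abnormally, the master automatically starts a new one
Worker process: workers do the actual handling of network requests. They are independent of one another, and each request is handled by exactly one worker. The number of workers is configurable and is usually set to the number of CPU cores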

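2. Flow

Forking workers: once the master process has started, it sets up the sockets to listen on and then forks the worker processes. The workers therefore all inherit the listening descriptor listenfd, which becomes readable in every worker when a new connection arrives
Process lock: nginx can use a mutex (the accept_mutex directive) so that only one worker handles a given new connection. The worker holding the mutex registers the read event on listenfd, calls accept in the read handler to take the connection, then parses the request, processes it, and responds to the client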
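
The fork-after-listen flow can be sketched as a tiny pre-fork server in Python (Unix only, since it relies on os.fork). This is a deliberate simplification, not how nginx is written: the workers use a blocking accept and let the kernel hand each connection to a single worker instead of taking a mutex, and all function names are made up for the example.

```python
import os
import signal
import socket
from typing import List, Tuple


def handle(conn: socket.socket) -> None:
    # Reply with the worker's pid so the caller can see who served the request.
    with conn:
        conn.sendall(f"handled by worker {os.getpid()}\n".encode())


def worker_loop(listener: socket.socket) -> None:
    while True:
        conn, _ = listener.accept()  # each connection is accepted by exactly one worker
        handle(conn)


def start_master(num_workers: int = 2, host: str = "127.0.0.1",
                 port: int = 0) -> Tuple[socket.socket, List[int]]:
    # The master creates the listening socket first...
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(128)
    # ...then forks workers, which inherit the same listening descriptor.
    pids: List[int] = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:
            try:
                worker_loop(listener)
            finally:
                os._exit(0)
        pids.append(pid)
    return listener, pids


def stop_workers(pids: List[int]) -> None:
    # The master signals each worker and reaps it.
    for pid in pids:
        os.kill(pid, signal.SIGTERM)
        os.waitpid(pid, 0)
```

Calling `start_master(2)` returns the listening socket and the worker pids; connecting to the socket's port yields a line naming the worker that served the connection, and `stop_workers(pids)` shuts the workers down.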

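3. Benefits of the nginx multi-process model

Workers are independent, so request handling needs no locking between them, which saves overhead
Workers are independent and do not affect one another: if one exits abnormally, the others keep serving
The multi-process model underpins the reload hot-deployment mechanism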