

下载APP





关闭

讲堂

算法训练营

极客商城

客户端下载

兑换中心

企业服务

免费资讯

渠道合作

推荐作者

11 | 线程：如何让复杂的项目并行执行？

2019-04-19 刘超

趣谈Linux操作系统

进入课程



讲述：刘超

时长20:04大小18.39M



上一节我们讲了如何创建进程，这一节我们来看如何创建线程。

为什么要有线程？

其实，对于任何一个进程来讲，即便我们没有主动去创建线程，进程也是默认有一个主线程的。线程是负责执行二进制指令的，它会根据项目执行计划书，一行一行执行下去。进程要比线程管的宽多了，除了执行指令之外，内存、文件系统等等都要它来管。

所以，进程相当于一个项目，而线程就是为了完成项目需求，而建立的一个个开发任务。默认情况下，你可以建一个大的任务，就是完成某某功能，然后交给一个人让它从头做到尾，这就是主线程。但是有时候，你发现任务是可以拆解的，如果相关性没有非常大前后关联关系，就可以并行执行。

例如，你接到了一个开发任务，要开发 200 个页面，最后组成一个网站。这时候你就可以拆分成 20 个任务，每个任务 10 个页面，并行开发。都开发完了，再做一次整合，这肯定比依次开发 200 个页面快多了。

那我们能不能成立多个项目组实现并行开发呢？当然可以了，只不过这样做有两个比较麻烦的地方。

第一个麻烦是，立项。涉及的部门比较多，总是劳师动众。你本来想的是，只要能并行执行任务就可以，不需要把会议室都搞成独立的。另一个麻烦是，项目组是独立的，会议室是独立的，很多事情就不受你控制了，例如一旦有了两个项目组，就会有沟通问题。

所以，使用进程实现并行执行的问题也有两个。第一，创建进程占用资源太多；第二，进程之间的通信需要数据在不同的内存空间传来传去，无法共享。

除了希望任务能够并行执行，有的时候，你作为项目管理人员，肯定要管控风险，因此还会预留一部分人作为应急小分队，来处理紧急的事情。

例如，主线程正在一行一行执行二进制命令，突然收到一个通知，要做一点小事情，应该停下主线程来做么？太耽误事情了，应该创建一个单独的线程，单独处理这些事件。

另外，咱们希望自己的公司越来越有竞争力。要想实现远大的目标，我们不能把所有人力都用在接项目上，应该预留一些人力来做技术积累，比如开发一些各个项目都能用到的共享库、框架等等。

在 Linux 中，有时候我们希望将前台的任务和后台的任务分开。因为有些任务是需要马上返回结果的，例如你输入了一个字符，不可能五分钟再显示出来；而有些任务是可以默默执行的，例如将本机的数据同步到服务器上去，这个就没刚才那么着急。因此这样两个任务就应该在不同的线程处理，以保证互不耽误。

如何创建线程？

看来多线程还是有很多好处的。接下来我们来看一下，如何使用线程来干一件大事。

假如说，现在我们有 N 个非常大的视频需要下载，一个个下载需要的时间太长了。按照刚才的思路，我们可以拆分成 N 个任务，分给 N 个线程各自去下载。

我们知道，进程的执行是需要项目执行计划书的，那线程是一个项目小组，这个小组也应该有自己的项目执行计划书，也就是一个函数。我们将要执行的子任务放在这个函数里面，比如上面的下载任务。

这个函数参数是 void 类型的指针，用于接收任何类型的参数。我们就可以将要下载的文件的文件名通过这个指针传给它。

为了方便，我将代码整段都贴在这里，这样你把下面的代码放在一个文件里面就能成功编译。

当然，这里我们不是真的下载这个文件，而仅仅打印日志，并生成一个一百以内的随机数，作为下载时间返回。这样，每个子任务干活的同时在喊：“我正在下载，终于下载完了，用了多少时间。”

 #include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
 
#define NUM_OF_TASKS 5
 
void *downloadfile(void *filename)
{
   printf("I am downloading the file %s!\n", (char *)filename);
   sleep(10);
   long downloadtime = rand()%100;
   printf("I finish downloading the file within %d minutes!\n", downloadtime);
   pthread_exit((void *)downloadtime);
}
 
int main(int argc, char *argv[])
{
   char files[NUM_OF_TASKS][20]={"file1.avi","file2.rmvb","file3.mp4","file4.wmv","file5.flv"};
   pthread_t threads[NUM_OF_TASKS];
   int rc;
   int t;
   int downloadtime;
 
   pthread_attr_t thread_attr;
   pthread_attr_init(&thread_attr);
   pthread_attr_setdetachstate(&thread_attr,PTHREAD_CREATE_JOINABLE);
 
   for(t=0;t<NUM_OF_TASKS;t++){
     printf("creating thread %d, please help me to download %s\n", t, files[t]);
     rc = pthread_create(&threads[t], &thread_attr, downloadfile, (void *)files[t]);
     if (rc){
       printf("ERROR; return code from pthread_create() is %d\n", rc);
       exit(-1);
     }
   }
 
   pthread_attr_destroy(&thread_attr);
 
   for(t=0;t<NUM_OF_TASKS;t++){
     pthread_join(threads[t],(void**)&downloadtime);
     printf("Thread %d downloads the file %s in %d minutes.\n",t,files[t],downloadtime);
   }
 
   pthread_exit(NULL);
}复制代码

一个运行中的线程可以调用 pthread_exit 退出线程。这个函数可以传入一个参数转换为 (void *) 类型。这是线程退出的返回值。

接下来，我们来看主线程。在这里面，我列了五个文件名。接下来声明了一个数组，里面有五个 pthread_t 类型的线程对象。

接下来，声明一个线程属性 pthread_attr_t。我们通过 pthread_attr_init 初始化这个属性，并且设置属性 PTHREAD_CREATE_JOINABLE。这表示将来主线程程等待这个线程的结束，并获取退出时的状态。

接下来是一个循环。对于每一个文件和每一个线程，可以调用 pthread_create 创建线程。一共有四个参数，第一个参数是线程对象，第二个参数是线程的属性，第三个参数是线程运行函数，第四个参数是线程运行函数的参数。主线程就是通过第四个参数，将自己的任务派给子线程。

任务分配完毕，每个线程下载一个文件，接下来主线程要做的事情就是等待这些子任务完成。当一个线程退出的时候，就会发送信号给其他所有同进程的线程。有一个线程使用 pthread_join 获取这个线程退出的返回值。线程的返回值通过 pthread_join 传给主线程，这样子线程就将自己下载文件所耗费的时间，告诉给主线程。

好了，程序写完了，开始编译。多线程程序要依赖于 libpthread.so。

 gcc download.c -lpthread复制代码

编译好了，执行一下，就能得到下面的结果。

 # ./a.out
creating thread 0, please help me to download file1.avi
creating thread 1, please help me to download file2.rmvb
I am downloading the file file1.avi!
creating thread 2, please help me to download file3.mp4
I am downloading the file file2.rmvb!
creating thread 3, please help me to download file4.wmv
I am downloading the file file3.mp4!
creating thread 4, please help me to download file5.flv
I am downloading the file file4.wmv!
I am downloading the file file5.flv!
I finish downloading the file within 83 minutes!
I finish downloading the file within 77 minutes!
I finish downloading the file within 86 minutes!
I finish downloading the file within 15 minutes!
I finish downloading the file within 93 minutes!
Thread 0 downloads the file file1.avi in 83 minutes.
Thread 1 downloads the file file2.rmvb in 86 minutes.
Thread 2 downloads the file file3.mp4 in 77 minutes.
Thread 3 downloads the file file4.wmv in 93 minutes.
Thread 4 downloads the file file5.flv in 15 minutes.复制代码

这里我们画一张图总结一下，一个普通线程的创建和运行过程。

线程的数据

线程可以将项目并行起来，加快进度，但是也带来的负面影响，过程并行起来了，那数据呢？

我们把线程访问的数据细分成三类。下面我们一一来看。

第一类是线程栈上的本地数据，比如函数执行过程中的局部变量。前面我们说过，函数的调用会使用栈的模型，这在线程里面是一样的。只不过每个线程都有自己的栈空间。

栈的大小可以通过命令 ulimit -a 查看，默认情况下线程栈大小为 8192（8MB）。我们可以使用命令 ulimit -s 修改。

对于线程栈，可以通过下面这个函数 pthread_attr_t，修改线程栈的大小。

 int pthread_attr_setstacksize(pthread_attr_t *attr, size_t stacksize);复制代码

主线程在内存中有一个栈空间，其他线程栈也拥有独立的栈空间。为了避免线程之间的栈空间踩踏，线程栈之间还会有小块区域，用来隔离保护各自的栈空间。一旦另一个线程踏入到这个隔离区，就会引发段错误。

第二类数据就是在整个进程里共享的全局数据。例如全局变量，虽然在不同进程中是隔离的，但是在一个进程中是共享的。如果同一个全局变量，两个线程一起修改，那肯定会有问题，有可能把数据改的面目全非。这就需要有一种机制来保护他们，比如你先用我再用。这一节的最后，我们专门来谈这个问题。

那线程能不能像进程一样，也有自己的私有数据呢？如果想声明一个线程级别，而非进程级别的全局变量，有没有什么办法呢？虽然咱们都是一个大组，分成小组，也应该有点隐私。

这就是第三类数据，线程私有数据（Thread Specific Data），可以通过以下函数创建：

 int pthread_key_create(pthread_key_t *key, void (*destructor)(void*))复制代码

可以看到，创建一个 key，伴随着一个析构函数。

key 一旦被创建，所有线程都可以访问它，但各线程可根据自己的需要往 key 中填入不同的值，这就相当于提供了一个同名而不同值的全局变量。

我们可以通过下面的函数设置 key 对应的 value。

 int pthread_setspecific(pthread_key_t key, const void *value)复制代码

我们还可以通过下面的函数获取 key 对应的 value。

 void *pthread_getspecific(pthread_key_t key)复制代码

而等到线程退出的时候，就会调用析构函数释放 value。

数据的保护

接下来，我们来看共享的数据保护问题。

我们先来看一种方式，Mutex，全称 Mutual Exclusion，中文叫互斥。顾名思义，有你没我，有我没你。它的模式就是在共享数据访问的时候，去申请加把锁，谁先拿到锁，谁就拿到了访问权限，其他人就只好在门外等着，等这个人访问结束，把锁打开，其他人再去争夺，还是遵循谁先拿到谁访问。

我这里构建了一个“转账”的场景。相关的代码我放到这里，你可以看看。

 #include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
 
#define NUM_OF_TASKS 5
 
int money_of_tom = 100;
int money_of_jerry = 100;
// 第一次运行去掉下面这行
pthread_mutex_t g_money_lock;
 
void *transfer(void *notused)
{
  pthread_t tid = pthread_self();
  printf("Thread %u is transfering money!\n", (unsigned int)tid);
  // 第一次运行去掉下面这行
  pthread_mutex_lock(&g_money_lock);
  sleep(rand()%10);
  money_of_tom+=10;
  sleep(rand()%10);
  money_of_jerry-=10;
  // 第一次运行去掉下面这行
  pthread_mutex_unlock(&g_money_lock);
  printf("Thread %u finish transfering money!\n", (unsigned int)tid);
  pthread_exit((void *)0);
}
 
int main(int argc, char *argv[])
{
  pthread_t threads[NUM_OF_TASKS];
  int rc;
  int t;
  // 第一次运行去掉下面这行
  pthread_mutex_init(&g_money_lock, NULL);
 
  for(t=0;t<NUM_OF_TASKS;t++){
    rc = pthread_create(&threads[t], NULL, transfer, NULL);
    if (rc){
      printf("ERROR; return code from pthread_create() is %d\n", rc);
      exit(-1);
    }
  }
  
  for(t=0;t<100;t++){
    // 第一次运行去掉下面这行
    pthread_mutex_lock(&g_money_lock);
    printf("money_of_tom + money_of_jerry = %d\n", money_of_tom + money_of_jerry);
    // 第一次运行去掉下面这行
    pthread_mutex_unlock(&g_money_lock);
  }
  // 第一次运行去掉下面这行
  pthread_mutex_destroy(&g_money_lock);
  pthread_exit(NULL);
}复制代码

这里说，有两个员工 Tom 和 Jerry，公司食堂的饭卡里面各自有 100 元，并行启动 5 个线程，都是 Jerry 转 10 元给 Tom，主线程不断打印 Tom 和 Jerry 的资金之和。按说，这样的话，总和应该永远是 200 元。

在上面的程序中，我们先去掉 mutex 相关的行，就像注释里面写的那样。在没有锁的保护下，在 Tom 的账户里面加上 10 元，在 Jerry 的账户里面减去 10 元，这不是一个原子操作。

我们来编译一下。

 gcc mutex.c -lpthread复制代码

然后运行一下，就看到了下面这样的结果。

 [root@deployer createthread]# ./a.out
Thread 508479232 is transfering money!
Thread 491693824 is transfering money!
Thread 500086528 is transfering money!
Thread 483301120 is transfering money!
Thread 516871936 is transfering money!
money_of_tom + money_of_jerry = 200
money_of_tom + money_of_jerry = 200
money_of_tom + money_of_jerry = 220
money_of_tom + money_of_jerry = 220
money_of_tom + money_of_jerry = 230
money_of_tom + money_of_jerry = 240
Thread 483301120 finish transfering money!
money_of_tom + money_of_jerry = 240
Thread 508479232 finish transfering money!
Thread 500086528 finish transfering money!
money_of_tom + money_of_jerry = 220
Thread 516871936 finish transfering money!
money_of_tom + money_of_jerry = 210
money_of_tom + money_of_jerry = 210
Thread 491693824 finish transfering money!
money_of_tom + money_of_jerry = 200
money_of_tom + money_of_jerry = 200复制代码

可以看到，中间有很多状态不正确，比如两个人的账户之和出现了超过 200 的情况，也就是 Tom 转入了，Jerry 还没转出。

接下来我们在上面的代码里面，加上 mutex，然后编译、运行，就得到了下面的结果。

 [root@deployer createthread]# ./a.out
Thread 568162048 is transfering money!
Thread 576554752 is transfering money!
Thread 551376640 is transfering money!
Thread 542983936 is transfering money!
Thread 559769344 is transfering money!
Thread 568162048 finish transfering money!
Thread 576554752 finish transfering money!
money_of_tom + money_of_jerry = 200
money_of_tom + money_of_jerry = 200
money_of_tom + money_of_jerry = 200
Thread 542983936 finish transfering money!
Thread 559769344 finish transfering money!
money_of_tom + money_of_jerry = 200
money_of_tom + money_of_jerry = 200
Thread 551376640 finish transfering money!
money_of_tom + money_of_jerry = 200
money_of_tom + money_of_jerry = 200
money_of_tom + money_of_jerry = 200
money_of_tom + money_of_jerry = 200复制代码

这个结果就正常了。两个账号之和永远是 200。这下你看到锁的作用了吧？

使用 Mutex，首先要使用 pthread_mutex_init 函数初始化这个 mutex，初始化后，就可以用它来保护共享变量了。

pthread_mutex_lock() 就是去抢那把锁的函数，如果抢到了，就可以执行下一行程序，对共享变量进行访；如果没抢到，就被阻塞在那里等待。

如果不想被阻塞，可以使用 pthread_mutex_trylock 去抢那把锁，如果抢到了，就可以执行下一行程序，对共享变量进行访问；如果没抢到，不会被阻塞，而是返回一个错误码。

当共享数据访问结束了，别忘了使用 pthread_mutex_unlock 释放锁，让给其他人使用，最终调用 pthread_mutex_destroy 销毁掉这把锁。

这里我画个图，总结一下 Mutex 的使用流程。

在使用 Mutex 的时候，有个问题是如果使用 pthread_mutex_lock()，那就需要一直在那里等着。如果是 pthread_mutex_trylock()，就可以不用等着，去干点儿别的，但是我怎么知道什么时候回来再试一下，是不是轮到我了呢？能不能在轮到我的时候，通知我一下呢？

这其实就是条件变量，也就是说如果没事儿，就让大家歇着，有事儿了就去通知，别让人家没事儿就来问问，浪费大家的时间。

但是当它接到了通知，来操作共享资源的时候，还是需要抢互斥锁，因为可能很多人都受到了通知，都来访问了，所以条件变量和互斥锁是配合使用的。

我这里还是用一个场景给你解释。

你这个老板，招聘了三个员工，但是你不是有了活才去招聘员工，而是先把员工招来，没有活的时候员工需要在那里等着，一旦有了活，你要去通知他们，他们要去抢活干（为啥要抢活？因为有绩效呀！），干完了再等待，你再有活，再通知他们。

具体的样例代码我也放在这里。你可以直接编译运行。

 #include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
 
#define NUM_OF_TASKS 3
#define MAX_TASK_QUEUE 11
 
char tasklist[MAX_TASK_QUEUE]="ABCDEFGHIJ";
int head = 0;
int tail = 0;
 
int quit = 0;
 
pthread_mutex_t g_task_lock;
pthread_cond_t g_task_cv;
 
void *coder(void *notused)
{
  pthread_t tid = pthread_self();
 
  while(!quit){
 
    pthread_mutex_lock(&g_task_lock);
    while(tail == head){
      if(quit){
        pthread_mutex_unlock(&g_task_lock);
        pthread_exit((void *)0);
      }
      printf("No task now! Thread %u is waiting!\n", (unsigned int)tid);
      pthread_cond_wait(&g_task_cv, &g_task_lock);
      printf("Have task now! Thread %u is grabing the task !\n", (unsigned int)tid);
    }
    char task = tasklist[head++];
    pthread_mutex_unlock(&g_task_lock);
    printf("Thread %u has a task %c now!\n", (unsigned int)tid, task);
    sleep(5);
    printf("Thread %u finish the task %c!\n", (unsigned int)tid, task);
  }
 
  pthread_exit((void *)0);
}
 
int main(int argc, char *argv[])
{
  pthread_t threads[NUM_OF_TASKS];
  int rc;
  int t;
 
  pthread_mutex_init(&g_task_lock, NULL);
  pthread_cond_init(&g_task_cv, NULL);
 
  for(t=0;t<NUM_OF_TASKS;t++){
    rc = pthread_create(&threads[t], NULL, coder, NULL);
    if (rc){
      printf("ERROR; return code from pthread_create() is %d\n", rc);
      exit(-1);
    }
  }
 
  sleep(5);
 
  for(t=1;t<=4;t++){
    pthread_mutex_lock(&g_task_lock);
    tail+=t;
    printf("I am Boss, I assigned %d tasks, I notify all coders!\n", t);
    pthread_cond_broadcast(&g_task_cv);
    pthread_mutex_unlock(&g_task_lock);
    sleep(20);
  }
 
  pthread_mutex_lock(&g_task_lock);
  quit = 1;
  pthread_cond_broadcast(&g_task_cv);
  pthread_mutex_unlock(&g_task_lock);
 
  pthread_mutex_destroy(&g_task_lock);
  pthread_cond_destroy(&g_task_cv);
  pthread_exit(NULL);
}复制代码

首先，我们创建了 10 个任务，每个任务一个字符，放在一个数组里面，另外有两个变量 head 和 tail，表示当前分配的工作从哪里开始，到哪里结束。如果 head 等于 tail，则当前的工作分配完毕；如果 tail 加 N，就是新分配了 N 个工作。

接下来声明的 pthread_mutex_t g_task_lock 和 pthread_cond_t g_task_cv，是用于通知和抢任务的，工作模式如下图所示：

图中左边的就是员工的工作模式，对于每一个员工 coder，先要获取锁 pthread_mutex_lock，这样才能保证一个任务只分配给一个员工。

然后，我们要判断有没有任务，也就是说，head 和 tail 是否相等。如果不相等的话，就是有任务，则取出 head 位置代表的任务 task，然后将 head 加一，这样整个任务就给了这个员工，下个员工来抢活的时候，也需要获取锁，获取之后抢到的就是下一个任务了。当这个员工抢到任务后，pthread_mutex_unlock 解锁，让其他员工可以进来抢任务。抢到任务后就开始干活了，这里没有真正开始干活，而是 sleep，也就是摸鱼了 5 秒。

如果发现 head 和 tail 相当，也就是没有任务，则需要调用 pthread_cond_wait 进行等待，这个函数会把锁也作为变量传进去。这是因为等待的过程中需要解锁，要不然，你不干活，等待睡大觉，还把门给锁了，别人也干不了活，而且老板也没办法获取锁来分配任务。

一开始三个员工都是在等待的状态，因为初始化的时候，head 和 tail 相等都为零。

现在我们把目光聚焦到老板这里，也就是主线程上。它初始化了条件变量和锁，然后创建三个线程，也就是我们说的招聘了三个员工。

接下来要开始分配任务了，总共 10 个任务。老板分四批分配，第一批一个任务三个人抢，第二批两个任务，第三批三个任务，正好每人抢到一个，第四批四个任务，可能有一个员工抢到两个任务。这样三个员工，四批工作，经典的场景差不多都覆盖到了。

老板分配工作的时候，也是要先获取锁 pthread_mutex_lock，然后通过 tail 加一来分配任务，这个时候 head 和 tail 已经不一样了，但是这个时候三个员工还在 pthread_cond_wait 那里睡着呢，接下来老板要调用 pthread_cond_broadcast 通知所有的员工，来活了，醒醒，起来干活。

这个时候三个员工醒来后，先抢锁，生怕老板只分配了一个任务，让别人抢去。当然抢锁这个动作是 pthread_cond_wait 在收到通知的时候，自动做的，不需要我们另外写代码。

抢到锁的员工就通过 while 再次判断 head 和 tail 是否相同。这次因为有了任务，不相同了，所以就抢到了任务。而没有抢到任务的员工，由于抢锁失败，只好等待抢到任务的员工释放锁，抢到任务的员工在 tasklist 里面拿到任务后，将 head 加一，然后就释放锁。这个时候，另外两个员工才能从 pthread_cond_wait 中返回，然后也会再次通过 while 判断 head 和 tail 是否相同。不过已经晚了，任务都让人家抢走了，head 和 tail 又一样了，所以只好再次进入 pthread_cond_wait，接着等任务。

这里，我们只解析了第一批一个任务的工作的过程。如果运行上面的程序，可以得到下面的结果。我将整个过程在里面写了注释，你看起来就比较容易理解了。

 [root@deployer createthread]# ./a.out
// 招聘三个员工，一开始没有任务，大家睡大觉
No task now! Thread 3491833600 is waiting!
No task now! Thread 3483440896 is waiting!
No task now! Thread 3475048192 is waiting!
// 老板开始分配任务了，第一批任务就一个，告诉三个员工醒来抢任务
I am Boss, I assigned 1 tasks, I notify all coders!
// 员工一先发现有任务了，开始抢任务
Have task now! Thread 3491833600 is grabing the task !
// 员工一抢到了任务 A，开始干活
Thread 3491833600 has a task A now! 
// 员工二也发现有任务了，开始抢任务，不好意思，就一个任务，让人家抢走了，接着等吧
Have task now! Thread 3483440896 is grabing the task !
No task now! Thread 3483440896 is waiting!
// 员工三也发现有任务了，开始抢任务，你比员工二还慢，接着等吧
Have task now! Thread 3475048192 is grabing the task !
No task now! Thread 3475048192 is waiting!
// 员工一把任务做完了，又没有任务了，接着等待
Thread 3491833600 finish the task A !
No task now! Thread 3491833600 is waiting!
// 老板又有新任务了，这次是两个任务，叫醒他们
I am Boss, I assigned 2 tasks, I notify all coders!
// 这次员工二比较积极，先开始抢，并且抢到了任务 B
Have task now! Thread 3483440896 is grabing the task !
Thread 3483440896 has a task B now! 
// 这次员工三也聪明了，赶紧抢，要不然没有年终奖了，终于抢到了任务 C
Have task now! Thread 3475048192 is grabing the task !
Thread 3475048192 has a task C now! 
// 员工一上次抢到了，这次抢的慢了，没有抢到，是不是飘了
Have task now! Thread 3491833600 is grabing the task !
No task now! Thread 3491833600 is waiting!
// 员工二做完了任务 B，没有任务了，接着等待
Thread 3483440896 finish the task B !
No task now! Thread 3483440896 is waiting!
// 员工三做完了任务 C，没有任务了，接着等待
Thread 3475048192 finish the task C !
No task now! Thread 3475048192 is waiting!
// 又来任务了，这次是三个任务，人人有份
I am Boss, I assigned 3 tasks, I notify all coders!
// 员工一抢到了任务 D，员工二抢到了任务 E，员工三抢到了任务 F
Have task now! Thread 3491833600 is grabing the task !
Thread 3491833600 has a task D now! 
Have task now! Thread 3483440896 is grabing the task !
Thread 3483440896 has a task E now! 
Have task now! Thread 3475048192 is grabing the task !
Thread 3475048192 has a task F now! 
// 三个员工都完成了，然后都又开始等待
Thread 3491833600 finish the task D !
Thread 3483440896 finish the task E !
Thread 3475048192 finish the task F !
No task now! Thread 3491833600 is waiting!
No task now! Thread 3483440896 is waiting!
No task now! Thread 3475048192 is waiting!
// 公司活越来越多了，来了四个任务，赶紧干呀
I am Boss, I assigned 4 tasks, I notify all coders!
// 员工一抢到了任务 G，员工二抢到了任务 H，员工三抢到了任务 I
Have task now! Thread 3491833600 is grabing the task !
Thread 3491833600 has a task G now! 
Have task now! Thread 3483440896 is grabing the task !
Thread 3483440896 has a task H now! 
Have task now! Thread 3475048192 is grabing the task !
Thread 3475048192 has a task I now! 
// 员工一和员工三先做完了，发现还有一个任务开始抢
Thread 3491833600 finish the task G !
Thread 3475048192 finish the task I !
// 员工三没抢到，接着等
No task now! Thread 3475048192 is waiting!
// 员工一抢到了任务 J，多做了一个任务
Thread 3491833600 has a task J now! 
// 员工二这才把任务 H 做完，黄花菜都凉了，接着等待吧
Thread 3483440896 finish the task H !
No task now! Thread 3483440896 is waiting!
// 员工一做完了任务 J，接着等待
Thread 3491833600 finish the task J !
No task now! Thread 3491833600 is waiting!复制代码

总结时刻

这一节，我们讲了如何创建线程，线程都有哪些数据，如何对线程数据进行保护。

写多线程的程序是有套路的，我这里用一张图进行总结。你需要记住的是，创建线程的套路、mutex 使用的套路、条件变量使用的套路。

课堂练习

这一节讲了多线程编程的套路，但是我没有对于每一个函数进行详细的讲解，相关的还有很多其他的函数可以调用，这需要你自己去学习。这里我给你推荐一本书，你可以系统地学习一下。另外，上面的代码，建议你一定要上手编译运行一下。

欢迎留言和我分享你的疑惑和见解，也欢迎你收藏本节内容，反复研读。你也可以把今天的内容分享给你的朋友，和他一起学习、进步。

10 | 进程：公司接这么多项目，如何管？

12 | 进程数据结构（上）：项目多了就需要项目管理系统

 写留言

精选留言(56)

刘強

2019-04-19

两个线程操作同一临界区时，通过互斥锁保护，若A线程已经加锁，B线程再加锁时候会被阻塞，直到A释放锁，B再获得锁运行，进程B必须不停的主动获得锁、检查条件、释放锁、再获得锁、再检查、再释放，一直到满足运行的条件的时候才可以（而此过程中其他线程一直在等待该线程的结束），这种方式是比较消耗系统的资源的。而条件变量同样是阻塞，还需要通知才能唤醒，线程被唤醒后，它将重新检查判断条件是否满足，如果还不满足，该线程就休眠了，应该仍阻塞在这里，等待条件满足后被唤醒，节省了线程不断运行浪费的资源。这个过程一般用while语句实现。当线程B发现被锁定的变量不满足条件时会自动的释放锁并把自身置于等待状态，让出CPU的控制权给其它线程。其它线程此时就有机会去进行操作，当修改完成后再通知那些由于条件不满足而陷入等待状态的线程。这是一种通知模型的同步方式，大大的节省了CPU的计算资源，减少了线程之间的竞争，而且提高了线程之间的系统工作的效率。这种同步方式就是条件变量。
      这是网上的一段话。我觉得说的很清楚了。
      如果学过java的话，其实就是线程之间的互斥和协作，条件变量就是用来协作的，对应java里的wait()和notify()函数。
     我个人觉得读这个专栏必须有一定的基础理论，具体的说起码看过相关的书籍，了解个大概。如果只有一些语言的基础，没有看过相关计算机体系或者操作系统方面的书籍，看起来会很费劲，不知所云。就我自己来说，我看过于渊写的《一个操作系统的实现》，《linux内核设计与实现》《现代操作系统》《intel汇编程序》《深入理解计算机系统》《unix高级环境编程》等，尤其是于渊写的这本书，从计算机加电开始，一直到多进程，进程间通信，从汇编到c语言，都有完整的代码，都是作者自己亲手写的，可操作性极强。还有csapp（深入理解计算机系统）这本书，所有人公认的学计算机必须看的。还有很多，总之我想说的是自己必须去看书学习，仅仅想靠一个专栏的学习来了解一个东西是远远不够的。
      话又说回来，你可能会问你看了这么多书，早应该精通了吧，还到这儿来干嘛？其实这就是我最大的问题，书看的太多，理论知道的不少，但是动手实践太少，就是所谓的眼高手低。而且不实践，看的多，忘的多。一方面是工作原因，除非去大公司，哪有机会让你实践底层的技术。二是自己太懒，没有耐性，任何可用的东西都是一行行代码经过千锤百炼形成的。书看了不少仅仅满足了自己的求知欲，却没有弄出任何有用的东西。
       来这儿也是想看看理论是如何通过一行行代码落地的，也学学作者的实践经验，加深一些概念的理解。
       多看书没错，但效率比较低，从实践中不断总结，思考才是快速成长的正确方法。
       道理知道很多，但还是过不好这一生。理论掌握不少，还是写不出有用的代码。悲催！

展开



 58
安排

2019-04-19

很多同学不理解这个BOSS给员工分任务的场景为什么要用条件变量，因为用互斥量也可以实现。员工等在互斥量上一样会进入睡眠，等BOSS释放互斥锁时也会唤醒这些员工。这样看来根本没有用条件变量的必要。

我们可以换一个思路，如果不使用条件变量，而且BOSS也不是一直生产任务，那么这时互斥量就会空闲出来，总会有一个员工能拿到锁，员工线程这时候就会在while循环中不停的获得锁，判断状态，释放锁，这样的话就会十分消耗cpu资源了。
这时候我们可能会想到，在while循环中加个睡眠，例如5秒，也就是员工线程每隔5秒来执行一次获得锁，判断状态，释放锁的操作，这样就会减轻cpu资源的消耗。
但是实际应用场景中，我们无法判断到底间隔几秒来执行一次这个获得锁，判断状态，释放锁的流程，时间长了可能影响吞吐量，时间短了会造成cpu利用率过高，所以这时候引入了条件变量，将主动查询方式改成了被动通知方式，效率也就上去了。
文中的例子很容易让新手迷惑，我也不保证上述理解是对的，希望老师能够指点一二。

展开

 2

 19
why

2019-04-19

- 线程复制执行二进制指令
- 多进程缺点: 创建进程占用资源多; 进程间通信需拷贝内存, 不能共享
- 线程相关操作
    - pthread_exit(A), A 是线程退出的返回值
    - pthread_attr_t 线程属性, 用辅助函数初始化并设置值; 用完需要销毁
    - pthread_create 创建线程, 四个参数(线程对象, 属性, 运行函数, 运行参数)
    - pthread_join 获取线程退出返回值, 多线程依赖 libpthread.so
    - 一个线程退出, 会发送信号给其他所有同进程的线程
- 线程中有三类数据
    - 线程栈本地数据, 栈大小默认 8MB; 线程栈之间有保护间隔, 若误入会引发段错误
    - 进程共享的全局数据
    - 线程级别的全局变量(线程私有数据, pthread_key_create(key, destructer)); key 所有线程都可以访问, 可填入各自的值(同名不同值的全局变量)
- 数据保护
    - Mutex(互斥), 初始化; lock(没抢到则阻塞)/trylock(没抢到则返回错误码); unlock; destroy
    - 条件变量(通知), 收到通知, 还是要抢锁(由 wait 函数执行); 因此条件变量与互斥锁配合使用
    - 互斥锁所谓条件变量的参数, wait 函数会自动解锁/加锁
    - broadcast(通知); destroy

展开



 7
赵又新

2019-04-19

手敲了第一个代码，报未定义变量的错，查资料发现文章中PTHREAD_CTREATE_JOINABL打错了，末尾少了个E

展开



 6
刘強

2019-04-19

凌晨学习！

展开



 5
算不出流源

2019-04-19

刘老师，关于条件变量和互斥锁配合使用的一个问题，既然pthread_mutex_lock() 本身就是阻塞的，那么在你举的老板招聘三个员工抢活干的例子中为什么还需要再适使用条件变量pthread_cond_wait()来等活干呢，条件变量等待的时候不也是阻塞的吗？没有理解增加条件变量的意义，请老师指导

展开

 2

 4
张志强

2019-04-19

这一章干货很多，操作系统课上都要分好几节讲。消化需要不少时间



 4
一只特立独行的猪

2019-05-04

macOS下执行有不同的结果：

creating thread 0, please help me to download file1.avi
creating thread 1, please help me to download file2.rmvb
creating thread 2, please help me to download file3.mp4
I am downloading the file file1.avi!
I am downloading the file file2.rmvb!
creating thread 3, please help me to download file4.wmv
I am downloading the file file3.mp4!
creating thread 4, please help me to download file5.flv
I am downloading the file file4.wmv!
I am downloading the file file5.flv!
I finish downloading the file within 7 minutes!
I finish downloading the file within 49 minutes!
I finish downloading the file within 73 minutes!
I finish downloading the file within 58 minutes!
I finish downloading the file within 30 minutes!
Thread 0 downloads the file file1.avi in 49 minutes.
Thread 0 downloads the file file1.avi in 7 minutes.
Thread 1 downloads the file file2.rmvb in 7 minutes.
Thread 0 downloads the file file1.avi in 73 minutes.
Thread 1 downloads the file file2.rmvb in 73 minutes.
Thread 2 downloads the file file3.mp4 in 73 minutes.
Thread 0 downloads the file file1.avi in 58 minutes.
Thread 1 downloads the file file2.rmvb in 58 minutes.
Thread 2 downloads the file file3.mp4 in 58 minutes.
Thread 3 downloads the file file4.wmv in 58 minutes.
Thread 0 downloads the file file1.avi in 30 minutes.
Thread 1 downloads the file file2.rmvb in 30 minutes.
Thread 2 downloads the file file3.mp4 in 30 minutes.
Thread 3 downloads the file file4.wmv in 30 minutes.
Thread 4 downloads the file file5.flv in 30 minutes.

展开

 2

 3
不一样的烟火

2019-04-28

while(!quit){

    pthread_mutex_lock(&g_task_lock);
    while(tail == head){
      if(quit){
        pthread_mutex_unlock(&g_task_lock);
        pthread_exit((void *)0);
      }
      printf("No task now! Thread %u is waiting!\n", (unsigned int)tid);
      pthread_cond_wait(&g_task_cv, &g_task_lock);
      printf("Have task now! Thread %u is grabing the task !\n", (unsigned int)tid);
    }
    char task = tasklist[head++];
    pthread_mutex_unlock(&g_task_lock);
    printf("Thread %u has a task %c now!\n", (unsigned int)tid, task);
    sleep(5);
    printf("Thread %u finish the task %c!\n", (unsigned int)tid, task);
}
这段其实有个疑问，上一个线程执行完加锁等待任务之后，下一个线程为什么还可以打印自己在等待任务，打印那一段不是已经被锁住了吗

展开

 2

 1
唐龙

2019-04-20

本节的创建五条线程模拟文件下载的例子，我这里操作有一些问题。首先，我在主线程里使用同一个变量downloadtime接收子线程的返回值的时候，输出和预想的不一样。thread 0返回时没有问题，thread 1返回时会输出两条，downloadtime一样但是前面会把thread0 file0重新输出一遍再输出一遍thread1 file1……thread4返回时会输出五条，五个一样的downloadtime，一条thread0 file0，一条thread1 file1……最后一条一条thread4 file4。但是如果我把downloadtime设置为一个数组，即使用不同的地址，就不会出现这个问题了。我想知道出现前面情况的原因是什么。

展开



 1
WL

2019-04-20

老师请问一下，我编译文章中的例子时报错： error: 'PTHREAD_CREATE_JOINABL' undeclared (first use in this function) 这个应该咋解决

展开

作者回复: 少了一个E



 1
Geek_f9e53a

2019-04-19

线程栈上的本地数据和线程私有数据有什么本质区别，能否举例说明？谢谢

作者回复: 相当于全局变量，可以在多个函数间共享，比如线程里面有多个函数



 1
罗乾林

2019-04-19

@算不出流源我的理解pthread_cond_wait 调用之后是会让出互锁，避免占着资源不干活，其他线程才可能获取该锁。对应这个例子就是Boss拿不到锁无法进行任务分配



 1
🗿顾晓峰🈹🈳�...

2019-04-19

学习了

展开



 1
ninuxer

2019-04-19

打卡day12
线程这篇，算是看懂了😂



 1
Geek_082575

2019-04-19

刘老师可以再深入讲解线程的实现吗

展开

作者回复: 后面会讲的



 1
梦里

2019-08-24

这样写文章就对了!

展开




江山未

2019-07-21

请问下老师，while (tail == head)以及下面的tasklist[head++]，这两句执行起来是原子的吗。感觉如果同时唤醒多个线程，可能会有不止一个线程通过这个条件判断啊。
还是说唤醒的同时，还要先去抢一下锁才能执行？

展开

作者回复: cond本身唤醒的时候，就会去抢锁的




kdb_reboot

2019-07-10

第三个cond.c 的例子运行起来没问题,补充pthread_cond_wait说明如下:
DESCRIPTION
The pthread_cond_wait() function blocks on the specified condition variable, which atomically releases the specified mutex and causes the calling thread to block on the condition variable The blocked thread may be awakened by a call to pthread_cond_signal() or pthread_cond_broadcast().

This function atomically releases the mutex, causing the calling thread to block on the condition variable Upon successful completion, the mutex is locked and owned by the calling thread.

展开




kdb_reboot

2019-07-10

再说第二个例子, 不加锁情况下, 我这边在main 函数的for 里面加了sleep才会出现错误值, 如果不加,则不会出现

展开

作者回复: 不加sleep就太快了，看不出来





	#include <pthread.h>
	#include <stdio.h>
	#include <stdlib.h>

	#define NUM_OF_TASKS 5

	void downloadfile(void filename)
	{
	printf("I am downloading the file %s!\n", (char *)filename);
	sleep(10);
	long downloadtime = rand()%100;
	printf("I finish downloading the file within %d minutes!\n", downloadtime);
	pthread_exit((void *)downloadtime);
	}

	int main(int argc, char *argv[])
	{
	char files[NUM_OF_TASKS][20]={"file1.avi","file2.rmvb","file3.mp4","file4.wmv","file5.flv"};
	pthread_t threads[NUM_OF_TASKS];
	int rc;
	int t;
	int downloadtime;

	pthread_attr_t thread_attr;
	pthread_attr_init(&thread_attr);
	pthread_attr_setdetachstate(&thread_attr,PTHREAD_CREATE_JOINABLE);

	for(t=0;t<NUM_OF_TASKS;t++){
	printf("creating thread %d, please help me to download %s\n", t, files[t]);
	rc = pthread_create(&threads[t], &thread_attr, downloadfile, (void *)files[t]);
	if (rc){
	printf("ERROR; return code from pthread_create() is %d\n", rc);
	exit(-1);
	}
	}

	pthread_attr_destroy(&thread_attr);

	for(t=0;t<NUM_OF_TASKS;t++){
	pthread_join(threads[t],(void**)&downloadtime);
	printf("Thread %d downloads the file %s in %d minutes.\n",t,files[t],downloadtime);
	}

	pthread_exit(NULL);
	}

	# ./a.out
	creating thread 0, please help me to download file1.avi
	creating thread 1, please help me to download file2.rmvb
	I am downloading the file file1.avi!
	creating thread 2, please help me to download file3.mp4
	I am downloading the file file2.rmvb!
	creating thread 3, please help me to download file4.wmv
	I am downloading the file file3.mp4!
	creating thread 4, please help me to download file5.flv
	I am downloading the file file4.wmv!
	I am downloading the file file5.flv!
	I finish downloading the file within 83 minutes!
	I finish downloading the file within 77 minutes!
	I finish downloading the file within 86 minutes!
	I finish downloading the file within 15 minutes!
	I finish downloading the file within 93 minutes!
	Thread 0 downloads the file file1.avi in 83 minutes.
	Thread 1 downloads the file file2.rmvb in 86 minutes.
	Thread 2 downloads the file file3.mp4 in 77 minutes.
	Thread 3 downloads the file file4.wmv in 93 minutes.
	Thread 4 downloads the file file5.flv in 15 minutes.

	[root@deployer createthread]# ./a.out
	Thread 508479232 is transfering money!
	Thread 491693824 is transfering money!
	Thread 500086528 is transfering money!
	Thread 483301120 is transfering money!
	Thread 516871936 is transfering money!
	money_of_tom + money_of_jerry = 200
	money_of_tom + money_of_jerry = 200
	money_of_tom + money_of_jerry = 220
	money_of_tom + money_of_jerry = 220
	money_of_tom + money_of_jerry = 230
	money_of_tom + money_of_jerry = 240
	Thread 483301120 finish transfering money!
	money_of_tom + money_of_jerry = 240
	Thread 508479232 finish transfering money!
	Thread 500086528 finish transfering money!
	money_of_tom + money_of_jerry = 220
	Thread 516871936 finish transfering money!
	money_of_tom + money_of_jerry = 210
	money_of_tom + money_of_jerry = 210
	Thread 491693824 finish transfering money!
	money_of_tom + money_of_jerry = 200
	money_of_tom + money_of_jerry = 200

	[root@deployer createthread]# ./a.out
	Thread 568162048 is transfering money!
	Thread 576554752 is transfering money!
	Thread 551376640 is transfering money!
	Thread 542983936 is transfering money!
	Thread 559769344 is transfering money!
	Thread 568162048 finish transfering money!
	Thread 576554752 finish transfering money!
	money_of_tom + money_of_jerry = 200
	money_of_tom + money_of_jerry = 200
	money_of_tom + money_of_jerry = 200
	Thread 542983936 finish transfering money!
	Thread 559769344 finish transfering money!
	money_of_tom + money_of_jerry = 200
	money_of_tom + money_of_jerry = 200
	Thread 551376640 finish transfering money!
	money_of_tom + money_of_jerry = 200
	money_of_tom + money_of_jerry = 200
	money_of_tom + money_of_jerry = 200
	money_of_tom + money_of_jerry = 200

	[root@deployer createthread]# ./a.out
	// 招聘三个员工，一开始没有任务，大家睡大觉
	No task now! Thread 3491833600 is waiting!
	No task now! Thread 3483440896 is waiting!
	No task now! Thread 3475048192 is waiting!
	// 老板开始分配任务了，第一批任务就一个，告诉三个员工醒来抢任务
	I am Boss, I assigned 1 tasks, I notify all coders!
	// 员工一先发现有任务了，开始抢任务
	Have task now! Thread 3491833600 is grabing the task !
	// 员工一抢到了任务 A，开始干活
	Thread 3491833600 has a task A now!
	// 员工二也发现有任务了，开始抢任务，不好意思，就一个任务，让人家抢走了，接着等吧
	Have task now! Thread 3483440896 is grabing the task !
	No task now! Thread 3483440896 is waiting!
	// 员工三也发现有任务了，开始抢任务，你比员工二还慢，接着等吧
	Have task now! Thread 3475048192 is grabing the task !
	No task now! Thread 3475048192 is waiting!
	// 员工一把任务做完了，又没有任务了，接着等待
	Thread 3491833600 finish the task A !
	No task now! Thread 3491833600 is waiting!
	// 老板又有新任务了，这次是两个任务，叫醒他们
	I am Boss, I assigned 2 tasks, I notify all coders!
	// 这次员工二比较积极，先开始抢，并且抢到了任务 B
	Have task now! Thread 3483440896 is grabing the task !
	Thread 3483440896 has a task B now!
	// 这次员工三也聪明了，赶紧抢，要不然没有年终奖了，终于抢到了任务 C
	Have task now! Thread 3475048192 is grabing the task !
	Thread 3475048192 has a task C now!
	// 员工一上次抢到了，这次抢的慢了，没有抢到，是不是飘了
	Have task now! Thread 3491833600 is grabing the task !
	No task now! Thread 3491833600 is waiting!
	// 员工二做完了任务 B，没有任务了，接着等待
	Thread 3483440896 finish the task B !
	No task now! Thread 3483440896 is waiting!
	// 员工三做完了任务 C，没有任务了，接着等待
	Thread 3475048192 finish the task C !
	No task now! Thread 3475048192 is waiting!
	// 又来任务了，这次是三个任务，人人有份
	I am Boss, I assigned 3 tasks, I notify all coders!
	// 员工一抢到了任务 D，员工二抢到了任务 E，员工三抢到了任务 F
	Have task now! Thread 3491833600 is grabing the task !
	Thread 3491833600 has a task D now!
	Have task now! Thread 3483440896 is grabing the task !
	Thread 3483440896 has a task E now!
	Have task now! Thread 3475048192 is grabing the task !
	Thread 3475048192 has a task F now!
	// 三个员工都完成了，然后都又开始等待
	Thread 3491833600 finish the task D !
	Thread 3483440896 finish the task E !
	Thread 3475048192 finish the task F !
	No task now! Thread 3491833600 is waiting!
	No task now! Thread 3483440896 is waiting!
	No task now! Thread 3475048192 is waiting!
	// 公司活越来越多了，来了四个任务，赶紧干呀
	I am Boss, I assigned 4 tasks, I notify all coders!
	// 员工一抢到了任务 G，员工二抢到了任务 H，员工三抢到了任务 I
	Have task now! Thread 3491833600 is grabing the task !
	Thread 3491833600 has a task G now!
	Have task now! Thread 3483440896 is grabing the task !
	Thread 3483440896 has a task H now!
	Have task now! Thread 3475048192 is grabing the task !
	Thread 3475048192 has a task I now!
	// 员工一和员工三先做完了，发现还有一个任务开始抢
	Thread 3491833600 finish the task G !
	Thread 3475048192 finish the task I !
	// 员工三没抢到，接着等
	No task now! Thread 3475048192 is waiting!
	// 员工一抢到了任务 J，多做了一个任务
	Thread 3491833600 has a task J now!
	// 员工二这才把任务 H 做完，黄花菜都凉了，接着等待吧
	Thread 3483440896 finish the task H !
	No task now! Thread 3483440896 is waiting!
	// 员工一做完了任务 J，接着等待
	Thread 3491833600 finish the task J !
	No task now! Thread 3491833600 is waiting!