Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

c - How do I use extern to share variables between source files?

I know that global variables in C sometimes have the extern keyword. What is an extern variable? What is the declaration like? What is its scope?

This is related to sharing variables across source files, but how does that work precisely? Where do I use extern?

1 Reply

Using extern is only of relevance when the program you're building consists of multiple source files linked together, where some of the variables defined, for example, in source file file1.c need to be referenced in other source files, such as file2.c.

It is important to understand the difference between defining a variable and declaring a variable:

  • A variable is declared when the compiler is informed that a variable exists (and this is its type); it does not allocate the storage for the variable at that point.

  • A variable is defined when the compiler allocates the storage for the variable.

You may declare a variable multiple times (though once is sufficient); you may only define it once within a given scope. A variable definition is also a declaration, but not all variable declarations are definitions.
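The distinction can be seen even inside a single source file. The following is a minimal sketch, not part of the original answer; the names shared_count and bump_shared_count are hypothetical and chosen only for illustration.

```c
/* Declarations: tell the compiler the name and type; no storage is
 * allocated. Repeating a declaration is allowed if the types agree. */
extern int shared_count;
extern int shared_count;

/* Definition: storage is allocated here, and only here.
 * A definition also counts as a declaration. */
int shared_count = 5;

/* Any code that has seen a declaration may use the variable. */
int bump_shared_count(void)
{
    return ++shared_count;
}
```

Adding a second initialized definition such as `int shared_count = 6;` to the same file should be rejected by the compiler as a redefinition.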

Best way to declare and define global variables

The clean, reliable way to declare and define global variables is to use a header file to contain an extern declaration of the variable.

The header is included by the one source file that defines the variable and by all the source files that reference the variable. For each program, one source file (and only one source file) defines the variable. Similarly, one header file (and only one header file) should declare the variable. The header file is crucial; it enables cross-checking between independent TUs (translation units — think source files) and ensures consistency.

Although there are other ways of doing it, this method is simple and reliable. It is demonstrated by file3.h, file1.c and file2.c:

file3.h

extern int global_variable;  /* Declaration of the variable */

file1.c

#include "file3.h"  /* Declaration made available here */
#include "prog1.h"  /* Function declarations */

/* Variable defined here */
int global_variable = 37;    /* Definition checked against declaration */

int increment(void) { return global_variable++; }

file2.c

#include "file3.h"
#include "prog1.h"
#include <stdio.h>

void use_it(void)
{
    printf("Global variable: %d\n", global_variable++);
}

That's the best way to declare and define global variables.


The next two files complete the source for prog1:

The complete programs shown use functions, so function declarations have crept in. Both C99 and C11 require functions to be declared or defined before they are used (whereas C90 did not, for good reasons). I use the keyword extern in front of function declarations in headers for consistency, to match the extern in front of variable declarations in headers. Many people prefer not to use extern in front of function declarations; the compiler doesn't care, and ultimately neither do I, as long as you're consistent, at least within a source file.

prog1.h

extern void use_it(void);
extern int increment(void);

prog1.c

#include "file3.h"
#include "prog1.h"
#include <stdio.h>

int main(void)
{
    use_it();
    global_variable += 19;
    use_it();
    printf("Increment: %d\n", increment());
    return 0;
}

  • prog1 uses prog1.c, file1.c, file2.c, file3.h and prog1.h.
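On the earlier remark that extern is optional in front of function declarations: the sketch below puts both spellings in one file to show that the compiler treats them as the same declaration. The function name triple is hypothetical and does not appear in prog1.

```c
/* Both lines declare the same function. A function declaration at
 * file scope has external linkage whether or not 'extern' is written. */
extern int triple(int value);   /* style used in this answer's headers */
int triple(int value);          /* equivalent declaration without extern */

/* The single definition, checked against both declarations above. */
int triple(int value)
{
    return 3 * value;
}
```

For variables the keyword is not optional in the same way: at file scope, `int x;` without extern is a tentative definition, not a pure declaration.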

The file prog1.mk is a makefile for prog1 only. It will work with most versions of make produced since about the turn of the millennium. It is not tied specifically to GNU Make.

prog1.mk

# Minimal makefile for prog1

PROGRAM = prog1
FILES.c = prog1.c file1.c file2.c
FILES.h = prog1.h file3.h
FILES.o = ${FILES.c:.c=.o}

CC      = gcc
SFLAGS  = -std=c11
GFLAGS  = -g
OFLAGS  = -O3
WFLAG1  = -Wall
WFLAG2  = -Wextra
WFLAG3  = -Werror
WFLAG4  = -Wstrict-prototypes
WFLAG5  = -Wmissing-prototypes
WFLAGS  = ${WFLAG1} ${WFLAG2} ${WFLAG3} ${WFLAG4} ${WFLAG5}
UFLAGS  = # Set on command line only

CFLAGS  = ${SFLAGS} ${GFLAGS} ${OFLAGS} ${WFLAGS} ${UFLAGS}
LDFLAGS =
LDLIBS  =

all:    ${PROGRAM}

${PROGRAM}: ${FILES.o}
	${CC} -o $@ ${CFLAGS} ${FILES.o} ${LDFLAGS} ${LDLIBS}

prog1.o: ${FILES.h}
file1.o: ${FILES.h}
file2.o: ${FILES.h}

# If it exists, prog1.dSYM is a directory on macOS
DEBRIS = a.out core *~ *.dSYM
RM_FR  = rm -fr

clean:
	${RM_FR} ${FILES.o} ${PROGRAM} ${DEBRIS}


Guidelines

Rules to be broken by experts only, and only with good reason:

  • A header file only contains extern declarations of variables — never static or unqualified variable definitions.

  • For any given variable, only one header file declares it (SPOT — Single Point of Truth).

  • A source file never contains extern declarations of variables — source files always include the (sole) header that declares them.

  • For any given variable, exactly one source file defines the variable, preferably initializing it too. (Although there is no need to initialize explicitly to zero, it does no harm and can do some good, because there can be only one initialized definition of a particular global variable in a program).

  • The source file that defines the variable also includes the header to ensure that the definition and the declaration are consistent.

  • A function should never need to declare a variable using extern.

  • Avoid global variables whenever possible — use functions instead.
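As a rough illustration of the last guideline, the variable can be hidden behind functions so that no other source file ever needs an extern declaration. This is a sketch under assumptions: the file name counter.c and the counter_* function names are hypothetical, and the prototypes would live in a matching header.

```c
/* counter.c (hypothetical): 'static' gives the variable internal
 * linkage, so no other source file can refer to it by name. */
static int counter = 37;

/* Read the current value. */
int counter_get(void)
{
    return counter;
}

/* Increment the counter and return the value it had beforehand. */
int counter_increment(void)
{
    return counter++;
}

/* Add an arbitrary amount to the counter. */
void counter_add(int amount)
{
    counter += amount;
}
```

Callers then write counter_add(19) rather than modifying a global directly, and every access to the variable goes through this one file.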

The source code and text of this answer are available in my SOQ (Stack Overflow Questions) repository on GitHub in the src/so-0143-3204 sub-directory.

If you're not an experienced C programmer, you could (and perhaps should) stop reading here.

Not so good way to define global variables

With some (indeed, many) C compilers, you can get away with what's called a 'common' definition of a variable too. 'Common', here, refers to a technique used in Fortran for sharing variables between source files, using a (possibly named) COMMON block. What happens here is that each of a number of files provides a tentative definition of the variable. As long as no more than one file provides an initialized definition, then the various files end up sharing a common single definition of the variable:

file10.c

#include "prog2.h"

long l;   /* Do not do this in portable code */

void inc(void) { l++; }

file11.c

#include "prog2.h"

long l;   /* Do not do this in portable code */

void dec(void) { l--; }

file12.c

#include "prog2.h"
#include <stdio.h>

long l = 9;   /* Do not do this in portable code */

void put(void) { printf("l = %ld\n", l); }

This technique does not conform to the letter of the C standard and the 'one definition rule' — it is officially undefined behaviour:

J.2 Undefined behavior

An identifier with external linkage is used, but in the program there does not exist exactly one external definition for the identifier, or the identifier is not used and there exist multiple external definitions for the identifier (6.9).

§6.9 External definitions ¶5

An external definition is an external declaration that is also a definition of a function (other than an inline definition) or an object. If an identifier declared with external linkage is used in an expression (other than as part of the operand of a sizeof or _Alignof operator whose result is an integer constant), somewhere in the entire program there shall be exactly one external definition for the identifier; otherwise, there shall be no more than one. [161]

[161] Thus, if an identifier declared with external linkage is not used in an expression, there need be no external definition for it.

However, the C standard also lists it in informative Annex J as one of the Common extensions.

J.5.11 Multiple external definitions

There may be more than one external definition for the identifier of an object, with or without the explicit use of the keyword extern; if the definitions disagree, or more than one is initialized, the behavior is undefined (6.9.2).

Because this technique is not always supported, it is best to avoid using it, especially if your code needs to be portable. Using this technique, you can also end up with unintentional type punning.

If one of the files above declared l as a double instead of as a long, C's type-unsafe linkers probably would not spot the mismatch. If you're on a machine with 64-bit long and double, you'd not even get a warning; on a machine with 32-bit long and 64-bit double, you'd probably get a warning about the different sizes — the linker would use the largest size, exactly as a Fortran program would take the largest size of any common blocks.
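To get a feel for what such a mismatch does, the sketch below copies the bytes of a 64-bit integer into a double, which is roughly what happens when one file stores to long l and another reads double l from the same storage. It is a single-file stand-in, not the linker scenario itself; the function name bits_as_double is hypothetical, and it assumes a 64-bit double (IEEE 754 on typical platforms).

```c
#include <stdint.h>
#include <string.h>

/* The sketch only makes sense if both types occupy the same storage. */
_Static_assert(sizeof(double) == sizeof(int64_t),
               "sketch assumes a 64-bit double");

/* Return the double whose object representation equals that of 'bits'. */
double bits_as_double(int64_t bits)
{
    double result;
    memcpy(&result, &bits, sizeof result);  /* well-defined byte copy */
    return result;
}
```

With IEEE 754 doubles, bits_as_double(9) is a tiny subnormal number rather than 9.0, so the file that believes l is a double would see a value unrelated to what the other files stored.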

Note that GCC 10.1.0, which was released on 2020-05-07, changes the default compilation options to use -fno-common, which means that by default, the code above no longer links unless you override the default with -fcommon (or use attributes, etc.).


The next two files complete the source for prog2:

prog2.h

extern void dec(void);
extern void put(void);
extern void inc(void);

prog2.c

#include "prog2.h"
#include <stdio.h>

int main(void)
{
    inc();
    put();
    dec();
    put();
    dec();
    put();
}

  • prog2 uses prog2.c, file10.c, file11.c, file12.c, prog2.h.

Warning

As noted in comments here, and as stated in my answer to a similar ...

