Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


ksh - sed -i touching files that it doesn't change

Someone on our server ran sed -i 's/$var >> $var2/$var > $var2/' * to change appends to overwrites in some bash scripts in a common directory. No big deal: it was tested first with grep, which returned the expected result that only his files would be touched.

He ran the command, and now 1200 of the 1400 files in the folder have a new modification date, yet as far as we can tell, only his small handful of files actually changed.

  1. Why would sed 'touch' a file that it's not changing?
  2. Why would it 'touch' only a portion of the files and not all of them?
  3. Did it actually change something (maybe some trailing whitespace, or something totally unexpected because of the $'s in the sed regex)?

When GNU sed successfully edits a file in place, the file's timestamp is updated, whether or not its contents changed. To understand why, let's review how in-place editing is done:

  1. A temporary file is created to hold the output.

  2. sed processes the input file, sending output to the temporary file.

  3. If a backup file extension was specified, the input file is renamed to the backup file.

  4. Whether a backup is created or not, the temporary output file is renamed to the input file's name.
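
The steps above can be sketched as a small shell function. This is an illustration of the procedure only, not GNU sed's actual implementation: the function name inplace_edit and its argument order are hypothetical, and details such as copying the original file's permissions to the new file are left out.

```shell
# inplace_edit FILE SED_SCRIPT [BACKUP_SUFFIX]
# Hypothetical helper that mimics the four steps described above.
inplace_edit() {
    file=$1
    script=$2
    suffix=${3:-}
    dir=$(dirname "$file")

    # 1. Create the temporary file in the same directory as the input,
    #    so the final rename stays within one filesystem.
    tmp=$(mktemp "$dir/sedXXXXXX") || return 1

    # 2. Process the input file, sending output to the temporary file.
    if ! sed -e "$script" "$file" > "$tmp"; then
        rm -f "$tmp"
        return 1
    fi

    # 3. If a backup suffix was given, rename the input to the backup name.
    if [ -n "$suffix" ]; then
        mv "$file" "$file$suffix" || return 1
    fi

    # 4. Rename the temporary file to the input file's name.
    #    Nothing here checks whether the content differs from the original.
    mv -f "$tmp" "$file"
}
```

For example, inplace_edit inputfile 's/XXX/YYY/' .bak would leave the original content in inputfile.bak and a freshly written copy in inputfile.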

GNU sed does not track whether any changes were made to the file. Whatever is in the temporary output file is moved to the input file via rename.

There is a nice benefit to this procedure: POSIX requires that rename be atomic. Consequently, the input file is never in a mangled state: it is either the original file or the modified file and never part way in-between.

As a result of this procedure, any file that sed successfully processes will have its timestamp changed.
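
A quick way to see this for yourself is to compare the file's inode number and checksum before and after a substitution that cannot match. This is a minimal sketch that assumes GNU sed (BSD sed needs -i '' instead of -i); the scratch directory and variable names are made up for the demonstration.

```shell
#!/bin/sh
# Scratch directory and file for the demonstration (hypothetical names).
workdir=$(mktemp -d)
demo="$workdir/inputfile"
printf 'this is\na test.\n' > "$demo"

# Record the inode number and a checksum of the content.
inode_before=$(ls -i "$demo" | awk '{print $1}')
sum_before=$(cksum < "$demo")

# XXX never occurs in the file, so the content cannot change.
sed -i 's/XXX/YYY/' "$demo"

inode_after=$(ls -i "$demo" | awk '{print $1}')
sum_after=$(cksum < "$demo")

echo "inode:    $inode_before -> $inode_after"
echo "checksum: $sum_before -> $sum_after"
```

If sed behaves as described, the checksum stays the same while the inode number differs, which shows that the name now points at a newly created file.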

Example

Let's consider this inputfile:

$ cat inputfile
this is
a test.

Now, under the supervision of strace, let's run sed -i on it in a way guaranteed to cause no changes:

$ strace sed -i 's/XXX/YYY/' inputfile

The output, trimmed for brevity, looks like:

execve("/bin/sed", ["sed", "-i", "s/XXX/YYY/", "inputfile"], [/* 55 vars */]) = 0
[...snip...]
open("inputfile", O_RDONLY)             = 4
[...snip...]
open("./sediWWqLI", O_RDWR|O_CREAT|O_EXCL, 0600) = 6
[...snip...]
read(4, "this is\na test.\n", 4096)     = 16
write(6, "this is\n", 8)                = 8
write(6, "a test.\n", 8)                = 8
read(4, "", 4096)                       = 0
[...snip...]
close(4)                                = 0
[...snip...]
close(6)                                = 0
[...snip...]
rename("./sediWWqLI", "inputfile")      = 0

As you can see, sed opens the input file, inputfile, on file descriptor 4. It then creates a temporary file, ./sediWWqLI, on file descriptor 6 to hold the output. It reads the input file and writes its contents unchanged to the temporary file. When this is done, it calls rename to replace inputfile with the temporary file. Because inputfile is now a newly written file, its timestamp is new as well.

GNU sed source code

The relevant source code is in the execute.c file of the sed directory of the source. From version 4.2.1:

  ck_fclose (input->fp);
  ck_fclose (output_file.fp);
  if (strcmp(in_place_extension, "*") != 0)
    {
      char *backup_file_name = get_backup_file_name(target_name);
      ck_rename (target_name, backup_file_name, input->out_file_name);
      free (backup_file_name);
    }

  ck_rename (input->out_file_name, target_name, input->out_file_name);
  free (input->out_file_name);

ck_rename is a wrapper around the standard library function rename. The source for ck_rename is in sed/utils.c.

As you can see, no flag is kept to determine whether the file actually changed or not. rename is called regardless.
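
Since sed itself keeps no such flag, one way to avoid touching unrelated files is to check each file with grep first and run sed -i only on the files that match. The following is a rough sketch, not a sed feature: the function name edit_matching is hypothetical, the pattern is assumed to be a basic regular expression that both grep and sed read the same way, and neither the pattern nor the replacement may contain a slash.

```shell
# edit_matching PATTERN REPLACEMENT FILE...
# Hypothetical helper: substitute in place only in files that contain PATTERN.
edit_matching() {
    pattern=$1
    replacement=$2
    shift 2
    for f in "$@"; do
        # Skip directories and other non-regular files.
        [ -f "$f" ] || continue
        # Only files with at least one matching line are handed to sed,
        # so files without a match keep their inode and timestamp.
        if grep -q -- "$pattern" "$f"; then
            sed -i "s/$pattern/$replacement/" "$f"
        fi
    done
}
```

With this, something like edit_matching ' >> ' ' > ' * rewrites only the files that contain the pattern and leaves the rest of the directory alone.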

Files whose timestamps were not updated

As for the 200 of the 1400 files whose timestamps did not change, the likely explanation is that sed failed to process those files. One possibility is a permissions issue, for example files that the user running sed could not read.
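
To look for such files, one could list the directory entries that are not regular files or that the current user cannot read. This is only a sketch for narrowing things down, under the assumption that the failures were of this kind; the function name list_uneditable is hypothetical.

```shell
# list_uneditable FILE...
# Hypothetical helper: report arguments that a plain `sed -i` would likely skip.
list_uneditable() {
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            # Directories, sockets, dangling symlinks and so on.
            echo "$f: not a regular file"
        elif [ ! -r "$f" ]; then
            # sed has to read the file before it can write the temporary copy.
            echo "$f: not readable by $(id -un)"
        fi
    done
}
```

Running list_uneditable * in the shared directory as the user who ran sed would show the candidates. Note that root can read every file, so the second check is only meaningful for an unprivileged user.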

sed -i and Symbolic Links

As noted by mklement0, applying sed -i to a symbolic link leads to a surprising result. sed -i does not update the file pointed to by the symbolic link. Instead, sed -i overwrites the symbolic link with a new regular file.

This is a result of the call that sed makes to rename. As documented by man 2 rename:

If newpath refers to a symbolic link, the link will be overwritten.

mklement0 reports that this is also true of the BSD sed on Mac OS X 10.10.
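
The symbolic link behavior is easy to reproduce in a scratch directory. This is a minimal sketch assuming GNU sed; the directory and file names are made up for the demonstration.

```shell
#!/bin/sh
# Scratch directory with a regular file and a symbolic link to it.
linkdir=$(mktemp -d)
printf 'this is\na test.\n' > "$linkdir/target"
ln -s target "$linkdir/link"

# Edit through the link; the pattern never matches, so content is unchanged.
sed -i 's/XXX/YYY/' "$linkdir/link"

# Check what kind of file the name "link" refers to now.
if [ -L "$linkdir/link" ]; then
    link_state="symbolic link"
else
    link_state="regular file"
fi
echo "link is now a $link_state"
```

GNU sed also documents a --follow-symlinks option that, combined with -i, edits the link's destination instead of replacing the link. Whether it is available depends on the sed version and platform.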

