
powershell - Concurrent read/write access to XML

I am trying to implement information updates in an XML file, by multiple processes running on multiple machines, somewhat concurrently. My thinking is to loop for 10 minutes, trying to open and lock the file for writing at random intervals of up to 1 second. Once the file is open and locked, I load all the XML, add information for the current machine, sort the XML, then resave and remove the lock so the next machine can proceed. The problem is that Get-Content doesn't lock the file, so two machines could load the same XML, rather than the second loading XML with data from the first. I found this, which provides a way to lock the file and then read it by stream, but when I tried modifying it to this

$file = [IO.File]::Open($path, 'Open', 'ReadWrite', 'None')
$xml = Get-Content $path

I get an error because the file is locked. It seems that Get-Content doesn't lock the file, but it does respect a lock that is already there. So, is there a way to lock the file so that only the machine holding the lock can read and write? And perhaps more importantly, is this even the right approach, or is there some other approach to multi-process XML access? It seems like this would be a common scenario, so there must be some best-practice way to do it, even if there isn't a native cmdlet approach. FWIW, I have to support back to PowerShell 2.0, which no doubt constrains how I can approach this.

EDIT: Well, passing Read as the file-share argument in the [IO.File]::Open() call doesn't seem to be working. I now have this

$path = '\\Px\Support\Px Tools\Resources\jobs.xml'
foreach ($i in 1..10) {
    $sleepTime = get-random -minimum:2 -maximum:5
    $file = [IO.File]::Open($path, 'Open', 'ReadWrite', 'Read')
    [xml]$xml = Get-Content $path

    $newNode = $xml.createElement('Item')
    $newNode.InnerXml = "$id : $i : $sleepTime : $(Get-Date)"
    $xml.DocumentElement.AppendChild($newNode) > $null
    $xml.Save($path)
    $file.Close()
}

Which in theory should take the XML I have, with two dummy log items, read it, append another log item (with an ID, the iteration, the sleep time and a time stamp), and repeat 10 times, with random sleeps in between. It fails on the save with

"The process cannot access the file '\PxSupportPx ToolsResourcesjobs.xml' because it is being used by another process."

Am I really trying to do something that hasn't been done 1000 times before?

OK, based on the comments, here is where I am at. I want to make sure the original cannot be (easily) edited manually while processing is going on. So I have implemented this (see the sketch below):

1. Look for a sentinel file, and if it is not found:
2. Lock the original file so it can't be modified
3. Copy the original to be the sentinel file
4. Modify the sentinel file as needed
5. Unlock the original
6. Copy the sentinel file over the original
7. Delete the sentinel

Seems to me the iffy bit is just if someone manually modifies the original between unlocking it and the sentinel getting copied, which is highly unlikely. But, it seems like there should be a way to handle this with 100% certainty, and I can't think of a way with or without sentinel files.
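
For reference, a minimal sketch of that workflow might look like the following (assuming a hypothetical $sentinelPath name derived from $path; no retry loop, and the race discussed above remains):

$sentinelPath = "$path.sentinel"                             # hypothetical name
if (-not (Test-Path -LiteralPath $sentinelPath)) {           # 1: no sentinel yet
    # 2: allow other processes to read the original, but not to modify it
    $lock = [IO.File]::Open($path, 'Open', 'Read', 'Read')
    Copy-Item -LiteralPath $path -Destination $sentinelPath  # 3: copy original
    [xml] $xml = Get-Content -LiteralPath $sentinelPath      # 4: modify the copy
    # ... append nodes here ...
    $xml.Save($sentinelPath)
    $lock.Close()                                            # 5: unlock original
    Copy-Item -LiteralPath $sentinelPath -Destination $path  # 6: copy back (the race window)
    Remove-Item -LiteralPath $sentinelPath                   # 7: delete sentinel
}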


1 Reply


On a general note: files aren't optimized for concurrent access the way databases are, so if you need concurrent access with some sophistication, you'll have to roll your own.

This answer to a closely related question demonstrates use of a separate lock file (sentinel file) for managing concurrency with minimal disruption.

However, you can simplify the approach and obviate the need for a lock file if you're willing to put an exclusive lock on the file for the entire duration of reading it, modifying it, and saving the modifications.

By contrast, the lock-file approach allows reading and preparing modifications concurrently with other processes reading the file, and only requires the exclusive lock for the actual act of rewriting / replacing the file.
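
The basic mechanics of a lock file can be sketched as follows (an illustrative sketch under assumptions, not the linked answer's exact code; "$path.lock" is a hypothetical companion-file name). The CreateNew file mode fails if the file already exists, which makes acquiring the lock atomic across processes:

# Prepare the updated content for $path *before* acquiring the lock,
# so other processes can keep reading in the meantime.
# ... read $path, build $newContent ...

do {  # acquire the lock; CreateNew throws if the lock file already exists
    try {
        $lockFile = [IO.File]::Open("$path.lock", 'CreateNew', 'ReadWrite', 'None')
        break
    } catch [System.IO.IOException] {
        Start-Sleep -Milliseconds 200  # another process holds the lock; retry
    }
} while ($true)
try {
    # Rewrite the target. A robust implementation would re-read the file here
    # and reconcile the update if the file changed since it was prepared.
    Set-Content -LiteralPath $path -Value $newContent
} finally {
    $lockFile.Close()
    Remove-Item -LiteralPath "$path.lock"  # release the lock
}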

With both approaches, however, a period of exclusive locking of the file is required, so as to prevent the unpredictability of readers reading from a file while it is being rewritten.

That said, you still need cooperation from all processes involved:

  • Writers need to deal with the (temporary) inability to open the file exclusively, namely while other processes (readers or writers) are using it.

  • Similarly, readers must be prepared to handle the (temporary) inability to open the file (while it is being updated by a writer).

The key is to:

  • Open the file with file-share mode None (i.e., deny other processes use of the same file while you have it open), and to keep it open until updating has completed. This ensures that the operation is atomic from a cross-process perspective.

  • Use only the FileStream instance returned by [System.IO.File]::Open() to read from and write to the file (calling cmdlets or .NET methods such as System.Xml.XmlDocument.Save() with the file's path would fail, because they themselves try to open the file, which is exclusively locked at that point).


Here's a fixed version of your code that implements exclusive locking:

$path = '\\Px\Support\Px Tools\Resources\jobs.xml'
foreach ($i in 1..10) {

    $sleepTime = get-random -minimum:2 -maximum:5

    # Open the file with an exclusive lock so that no other process will
    # be able to even read it while an update is being performed.
    # Use a RETRY LOOP until exclusive locking succeeds.
    # You'll need a similar loop for *readers*.
    # Note: In production code, you should also implement a TIMEOUT.
    do {  # retry loop
      try {
        $file = [IO.File]::Open($path, 'Open', 'ReadWrite', 'None')
      } catch {
        # Did opening fail due to the file being LOCKED? -> keep trying.
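        # (In an IOException's HResult, the low word holds the Win32 error code:
        #  0x20 = ERROR_SHARING_VIOLATION, 0x21 = ERROR_LOCK_VIOLATION.)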
        if ($_.Exception.InnerException -is [System.IO.IOException] -and ($_.Exception.InnerException.HResult -band 0x21) -in 0x21, 0x20) { 
          $host.ui.Write('.') # Some visual feedback
          Start-Sleep -Milliseconds 500 # Sleep a little.
          continue # Try again.
        }
        Throw # Unexpected error -> rethrow.
      }
      break # Opening with exclusive lock succeeded, proceed below.
    } while ($true)


    # Read the file's content into an XML document (DOM).
    $xml = New-Object xml # xml is a type accelerator for System.Xml.XmlDocument
    $xml.Load($file)

    # Modify the XML document.
    $newNode = $xml.createElement('Item')
    $newNode.InnerXml = "$id : $i : $sleepTime : $(Get-Date)"
    $null = $xml.DocumentElement.AppendChild($newNode)

    # Convert the XML document back to a string
    # and write that string back to the file.
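    # Truncating matters: if the new XML is shorter than the previous
    # content, stale trailing bytes would otherwise corrupt the file.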
    $file.SetLength(0) # truncate existing content first
    $xml.Save($file)

    # Close the file and release the lock.
    $file.Close()
}
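
For completeness, the matching reader-side retry loop could look like the sketch below (simplified: any IOException is treated as "try again"; production code would discriminate error codes and time out, as above). Opening with Read access and a Read share lets multiple readers coexist while still being rejected whenever a writer holds its exclusive lock:

do {
    try {
        $file = [IO.File]::Open($path, 'Open', 'Read', 'Read')
        break # opened successfully, proceed below
    } catch [System.IO.IOException] {
        Start-Sleep -Milliseconds 500 # a writer is active -> wait and retry
    }
} while ($true)
try {
    $xml = New-Object xml
    $xml.Load($file) # read via the already-open stream
} finally {
    $file.Close()
}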

As for what you tried:

$file = [IO.File]::Open($path, 'Open', 'ReadWrite', 'Read') opens the file in a manner that allows other processes read access, but not write access.

You then call $xml.Save($path) while $file is still open; that method call, which itself tries to open the file too, requires write access, and therefore fails.

As shown above, the key is to use the same $file FileStream instance (the one used to open the file exclusively) for updating the file.

Also note that calling $file.Close() just before $xml.Save($path) is not a solution, because that introduces a race condition where another process could open the file in the time between the two statements.
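
That is, the following ordering reintroduces exactly the problem the exclusive lock is meant to solve:

$file.Close()    # the lock is released here ...
$xml.Save($path) # ... so another process can sneak in between these two calls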

