I have an Azure Durable Function written in Python (plus some related libraries). It has a typical architecture: an HTTP trigger invokes the orchestrator, and the orchestrator invokes several activity functions along the way (some of which make POST and GET calls with requests). The purpose of the whole function is to receive an image, read the characters on it with OCR (Azure Vision OCR), and return the result.
The logic is like this:
Start
Section 1: (Takes about 30 seconds on my local machine)
- Get the image
- Process it (image processing techniques)
- Send it to Azure OCR (I have an activity for this, let's call it OCRActivity)
- Get the result
- See if the result is good enough (I have some conditions for this)
- If it's good, return the result (Using a call-back activity)
- If it's not, go to section 2
Section 2: (Takes about 2 minutes on my local machine)
- Make 20 different images by applying various processing techniques to the original image
- Send those 20 images to Azure OCR using a fan-out architecture (multiple "OCRActivity" activities in parallel)
- Get the results
- Compare results with the result from Section 1 and choose the best one
- Return the result (Using a call-back activity)
End
(The whole process takes about 3-4 minutes on my local machine)
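To make the flow above concrete, here is a rough sketch of how such an orchestrator could look. It is an illustrative outline under assumptions, not the actual production code: only OCRActivity is named above, so PreprocessActivity, MakeVariantsActivity, CallbackActivity, the result shape ({"text", "confidence"}) and the 0.9 threshold are all hypothetical placeholders.

```python
try:
    import azure.durable_functions as df
except ImportError:  # allows reading/testing the sketch without the SDK installed
    df = None

# Hypothetical threshold; the real "good enough" conditions are not shown here.
GOOD_ENOUGH_CONFIDENCE = 0.9


def is_good_enough(result):
    """Deterministic check, safe to run inside the orchestrator."""
    return result["confidence"] >= GOOD_ENOUGH_CONFIDENCE


def choose_best(results):
    """Pick the candidate with the highest (assumed) confidence score."""
    return max(results, key=lambda r: r["confidence"])


def orchestrator_function(context):
    image_ref = context.get_input()

    # Section 1: preprocess, run OCR once, return early if the result is good.
    processed = yield context.call_activity("PreprocessActivity", image_ref)
    first = yield context.call_activity("OCRActivity", processed)
    if is_good_enough(first):
        yield context.call_activity("CallbackActivity", first)
        return first

    # Section 2: build the image variants in an activity, then fan out.
    variants = yield context.call_activity("MakeVariantsActivity", image_ref)
    tasks = [context.call_activity("OCRActivity", v) for v in variants]
    results = yield context.task_all(tasks)  # fan-in: wait for all OCR calls

    # Compare the fan-out results with the Section 1 result.
    best = choose_best([first] + list(results))
    yield context.call_activity("CallbackActivity", best)
    return best


# Entry point used by the Functions host when the SDK is available.
main = df.Orchestrator.create(orchestrator_function) if df is not None else None
```

In this sketch all image processing and HTTP calls live in activities, and the orchestrator itself only does deterministic bookkeeping, since orchestrator code is replayed.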
The issue is that everything works as expected on my local machine (using the Azure Storage Emulator etc.), but once deployed to production the behaviour is inconsistent. Some requests are processed and a response is received perfectly; typically these are the images that return a result after Section 1 without going into Section 2. Other requests run for a long time and return a response after 40 minutes or so (even more in some cases); almost all of these are the ones that go into Section 2 and use the fan-out to process the 20 different images. Sometimes no response is received at all.
The exceptions I receive are not related to my code logic; they come from Durable Functions itself and are hard to debug. Any help would be appreciated.
question from:
https://stackoverflow.com/questions/65920965/azure-durable-function-fails-after-deploying-to-production-python