We experienced an issue where our default email service was unable to connect to our SMTP server, resulting in email delivery failures. The incident started at 15:06:12 BRT (Brasília time; 18:06:12 UTC) on Nov 24, 2025.
The email queue is now empty and all pending emails have been delivered. The incident has been resolved and all services are operating normally.
The email backlog is currently being processed and pending emails are being sent out. We will provide further updates in 30 minutes.
The email server issue has been resolved. The service has reconnected and resumed sending emails. We are now processing the backlog to ensure all pending emails are delivered.
Our team is currently working on adjusting the email service configurations and processing the backlog of emails. We are actively monitoring the situation to ensure normal operation is fully restored.
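For illustration, the recover-then-drain pattern described in these updates (reconnect to SMTP, then deliver the queued backlog in order) might look roughly like the sketch below. The host name, port, retry count, and queue shape are assumptions for the example, not AEVO's actual configuration.

```python
import smtplib
import time
from email.message import EmailMessage

# Illustrative values only, not AEVO's production configuration.
SMTP_HOST = "smtp.example.com"  # hypothetical SMTP server
SMTP_PORT = 587
MAX_RETRIES = 5

def send_with_retry(msg: EmailMessage) -> bool:
    """Attempt delivery, backing off while the SMTP server is unreachable."""
    for attempt in range(MAX_RETRIES):
        try:
            with smtplib.SMTP(SMTP_HOST, SMTP_PORT, timeout=10) as smtp:
                smtp.starttls()
                smtp.send_message(msg)
            return True
        except (smtplib.SMTPException, OSError):
            time.sleep(2 ** attempt)  # exponential backoff before reconnecting
    return False

def drain_backlog(backlog: list) -> None:
    """Once connectivity is restored, deliver queued messages in order."""
    while backlog:
        if send_with_retry(backlog[0]):
            backlog.pop(0)   # delivered; drop it from the queue
        else:
            break            # still failing; keep the backlog for the next pass
```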
Idea submission for validation, access, and editing are experiencing issues
A recently deployed code version caused a failure in the idea submission and idea form access processes. As a mitigation, we rolled back to the last stable version while we work on fixing the latest release.
Mitigation successfully applied. Idea submission, access, and editing have been restored to normal.
We are applying a mitigation: rolling back to the last stable version.
At around 05:04 BRT, we identified an issue affecting idea submission for validation, access, and editing.
Cloudflare global incident impacting Initiatives, Strategy, and Generative AI services.
Cloudflare issue closed.
A fix has been implemented and we believe the incident is now resolved. We are continuing to monitor for errors to ensure all AEVO services are back to normal.
The issue has been identified and a fix is being implemented. We do not yet have an estimate for complete mitigation of the incident. We continue to monitor and track the resolution status.
Between 9:48 AM and 10:52 AM, production server 1 experienced slowdowns, with response times peaking between 3.83 and 4.03 minutes.
Server stability recorded at 15:21. It remains under monitoring.
Azure Web App Diagnostics: The application maintained an average availability of 98.74% over 23.75 hours, processing 261,168 requests with 257,871 successful requests and 3,297 server errors (5xx and ClientTimeout).

There were 3,245 total request failures, primarily HTTP 400.604 errors (2,057 occurrences, 63.39%), indicating client requests were cancelled or timed out before completion. Additionally, 898 HTTP 500 errors (27.67%) were recorded, mostly due to Azure request timeouts exceeding 230 seconds. HTTP 502 errors accounted for 149 occurrences (4.59%), signaling issues with application startup, gateway configuration, or backend connectivity.

High latency was detected, with a 95th-percentile average of 57,655 ms, pointing to performance bottlenecks likely related to the Azure App Service platform and application code errors. Memory dumps were captured and analyzed, though some attempts failed due to repeated collections. No application exceptions were found, but the analyzer itself reported a System.OperationCanceledException during memory dump analysis. Root causes are primarily performance-related rather than direct application code faults.
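As a quick sanity check, the failure percentages and the availability figure above can be reproduced directly from the reported counts:

```python
# Sanity-checking the failure breakdown reported above.
failures = {"HTTP 400.604": 2057, "HTTP 500": 898, "HTTP 502": 149}
total_failures = 3245

for code, count in failures.items():
    print(f"{code}: {count} ({count / total_failures:.2%})")
# -> 63.39%, 27.67%, 4.59%, matching the report

# Availability = successful requests / total requests
print(f"availability: {257_871 / 261_168:.2%}")  # -> 98.74%
```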
We have identified the need to perform a complete replacement of one of the affected servers to fully mitigate the problem. The estimated time for this task is 30 minutes. During this period, there may be temporary service interruptions.
New increase in response time at 1:54 PM, reaching 3.84 minutes.
Server stability recorded at 12:16. It remains under monitoring.
Multiple occurrences of this exception were recorded: "Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and the max pool size was reached." This indicates that all available connections in the SQL connection pool were exhausted: no connections were being released in time, so requests sat waiting on the pool, causing slowness and timeouts.
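To illustrate the failure mode (a toy sketch, not our actual data access code), the following simulates a pool whose connections are never released; the next request blocks until it times out, mirroring the exception above:

```python
import queue

class ToyPool:
    """Minimal stand-in for a database connection pool."""
    def __init__(self, max_size: int, timeout: float):
        self._free = queue.Queue()
        for i in range(max_size):
            self._free.put(f"conn-{i}")   # placeholders for real connections
        self._timeout = timeout

    def acquire(self) -> str:
        try:
            return self._free.get(timeout=self._timeout)
        except queue.Empty:
            raise TimeoutError("Timeout expired: no free connection in the pool")

    def release(self, conn: str) -> None:
        self._free.put(conn)

pool = ToyPool(max_size=2, timeout=1.0)
a = pool.acquire()
b = pool.acquire()
# Neither connection is released, so the next request waits and times out --
# the same pattern as the SQL pool exhaustion described above.
try:
    pool.acquire()
except TimeoutError as exc:
    print(exc)
# The fix is to guarantee release (try/finally or a context manager) so that
# connections return to the pool as soon as each request finishes.
```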
Global Network Issue - All services impacted
All AEVO Innovate systems are operational. The Azure Front Door issue has been resolved.
All services are operational again. We continue to monitor the status of the incident on Microsoft Azure Front Door and our services.
We are observing a partial recovery. We have noticed that the AEVO Innovate environments are back online, although some instability may still occur. According to Microsoft's most recent official statement on the incident:
"Azure Front Door - Connectivity Issues - Monitoring Recovery
Starting at approximately 16:00 UTC on October 29, 2025, Microsoft customers and services using Azure Front Door (AFD) may have experienced latencies, timeouts, and errors. We have confirmed that an inadvertent configuration change was the event that triggered this issue.
Current Status:
We have begun deploying our last known good configuration, which has now been successfully completed. Customers may have begun to see early signs of recovery. We are currently restoring nodes and routing traffic through healthy nodes, and as we progress through this workflow, customers will continue to see improvements.
Customer configuration changes will remain temporarily locked while we continue mitigation efforts. We will notify customers as soon as this lock is lifted.
Some customers may also have experienced issues accessing the Azure management portal. We have decoupled the AFD portal to mitigate these access issues. Customers should now be able to access the Azure portal directly. While most portal extensions are functioning as expected, a small number of endpoints (e.g., Marketplace) may still experience intermittent loading issues.
We currently anticipate full mitigation within the next four hours as we continue to recover the nodes. This means we expect recovery to occur by 11:20 PM UTC on October 29, 2025. We will provide another update on our progress within two hours, or sooner if necessary.
While we are observing signs of recovery and have an estimated timeline, customers may also consider implementing existing failover strategies using Azure Traffic Manager to redirect Azure Front Door traffic to their origin servers as an interim measure.
Learn more about failover strategies for Azure Front Door (AFD): https://learn.microsoft.com/en-us/azure/architecture/guide/networking/global-web-applications/overview
This message was last updated at 19:57 UTC on October 29, 2025."
AEVO will be monitoring the network.
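As a rough illustration of the interim failover idea in Microsoft's statement, a client-side health probe could prefer the AFD route and fall back to an origin. The endpoints below are hypothetical, and real failover for this scenario would normally happen at the DNS layer (e.g., Azure Traffic Manager) rather than in application code:

```python
import urllib.error
import urllib.request

# Hypothetical endpoints; real failover would typically be handled at the
# DNS layer (Azure Traffic Manager), not in application code.
AFD_ENDPOINT = "https://app.example.com/health"        # route through Azure Front Door
ORIGIN_ENDPOINT = "https://origin.example.com/health"  # direct-to-origin fallback

def pick_endpoint() -> str:
    """Prefer the AFD route; fall back to the origin when it is unhealthy."""
    try:
        with urllib.request.urlopen(AFD_ENDPOINT, timeout=5) as resp:
            if resp.status == 200:
                return AFD_ENDPOINT
    except (urllib.error.URLError, TimeoutError):
        pass  # AFD route unreachable or timing out
    return ORIGIN_ENDPOINT
```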
We are observing initial signs of recovery in the Azure Front Door (AFD) global network. Some AEVO Innovate environments are back online, though instabilities may still be present. We continue to monitor and to work on the failover. More updates in 30 minutes.
We are attempting a traffic redirection strategy that bypasses Azure Front Door. We have not been successful yet, but our engineering team continues working on these actions. More updates in 30 minutes.
AEVO is implementing a failover strategy in an attempt to restore services. More updates in 30 minutes.
Microsoft update:
"Azure Network Availability Issues
Starting at approximately 16:00 UTC, customers and Microsoft services that leverage Azure Front Door (AFD) may have experienced issues resulting in latencies, timeouts, and errors. We have confirmed that an inadvertent configuration change was the trigger event for this issue.
We are taking several concurrent actions. First, we are blocking all changes to the AFD services, including customer configuration changes. At the same time, we are rolling back our AFD configuration to our last known good state. As we roll back, we want to ensure that the problematic configuration doesn't re-initiate upon recovery.
Customers may have experienced problems accessing the Azure management portal. We have failed the portal away from AFD to mitigate the portal access issues, so customers should be able to access the Azure management portal directly. While portal extensions are generally working correctly, a small number of endpoints may have problems loading (e.g., Marketplace).
We do not have an ETA for when the rollback will be completed, but we will update this communication within 30 minutes or when we have an update.
This message was last updated at 17:46 UTC on 29 October 2025"
Microsoft update: "Azure Portal Access Issues
Starting at approximately 16:00 UTC, we began experiencing Azure Front Door issues resulting in a loss of availability of some services. In addition, customers may experience issues accessing the Azure Portal. Customers can attempt to use programmatic methods (PowerShell, CLI, etc.) to access/utilize resources if they are unable to access the portal directly. We have failed the portal away from Azure Front Door (AFD) to attempt to mitigate the portal access issues and are continuing to assess the situation. We are actively assessing failover options of internal services from our AFD infrastructure. Our investigation into the contributing factors and additional recovery workstreams continues. More information will be provided within 60 minutes or sooner. This message was last updated at 17:04 UTC on 29 October 2025"
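Microsoft's suggestion to fall back to programmatic methods when the portal is unavailable could look roughly like this with the Azure SDK for Python (assuming the azure-identity and azure-mgmt-resource packages are installed; the subscription ID is a placeholder). Equivalent PowerShell and Azure CLI commands exist.

```python
# Requires: pip install azure-identity azure-mgmt-resource
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

# DefaultAzureCredential picks up whatever is available (environment
# variables, managed identity, a prior `az login`, etc.).
credential = DefaultAzureCredential()
client = ResourceManagementClient(credential, "<subscription-id>")

# Enumerate resource groups without going through the Azure Portal.
for group in client.resource_groups.list():
    print(group.name)
```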
We are currently experiencing a global network issue affecting Microsoft Azure, which impacts all services connected to AEVO Innovate.
Official Statement from Microsoft:
“As of approximately 16:00 UTC, we began experiencing DNS issues, leading to reduced availability for some services. Customers may encounter difficulties accessing the Azure Portal. We have taken actions that should soon resolve access issues to the portal. We are actively investigating the root cause and implementing additional mitigation measures. Further updates will be provided within the next 60 minutes.”