Microsoft: AI models still struggle to debug software
Recent months have seen a steady increase in the use of artificial intelligence models from Anthropic, OpenAI, and other companies for programming tasks, but a new study found that finding and fixing software bugs remains a struggle for these models.
Google CEO Sundar Pichai said last October that 25% of the company's new code is generated by artificial intelligence, and Meta CEO Mark Zuckerberg has expressed ambitions to deploy AI coding models widely within the social media giant.
The new study, from the research and development division of the American software giant Microsoft, showed that artificial intelligence models, including Anthropic's Claude 3.7 Sonnet and OpenAI's o3-mini, failed to fix many problems in a software development benchmark known as SWE-bench Lite.
TechCrunch, which specializes in technology news, reported that the results of the study are a clear reminder that, despite the bold claims made by artificial intelligence companies such as OpenAI about the capabilities of the new technology, AI still cannot replace humans in many tasks such as programming.
The researchers who conducted the study tested nine different artificial intelligence models as the backbone of a "single prompt-based agent" with access to a number of debugging tools, including a Python debugger.
They tasked the agent with solving a selection of 300 software debugging tasks from SWE-bench Lite. According to the study's authors, even when equipped with newer and stronger models, the agent was unable to successfully complete more than half of the debugging tasks.
Claude 3.7 Sonnet achieved the highest success rate, at 48.4%, followed by OpenAI's o1 at 30.2% and o3-mini at just over 22%.