Dyżurnet.pl, a team operating at the NASK National Research Institute, is tasked with blocking illegal material on the Internet, in particular material related to the sexual abuse of children. Anyone who comes across such content online should report it to the team.
“Dyżurnet.pl moderators spend many hours every day reviewing illegal content – either reported by users or flagged by scrapers, i.e. algorithms that search the web for materials with specific parameters” – says Dr. Inez Okulska from NASK’s Artificial Intelligence Division. She explains that there is a great deal of such reported content, and someone always has to review it to assess whether the material really is illegal, whether it should be blocked, and whether the person who shared it should be prosecuted.
Artificial intelligence will flag duplicate pedophilic content
Martyna Różycka, head of Dyżurnet.pl, explains that the most urgent task is to identify materials that have not been reported before and that show the sexual abuse of children. If such material was created recently, a child may still be being harmed. It is then necessary to find the perpetrator as soon as possible to protect the potential victim. “Most of the materials that need to be blocked, however, are content that was created years ago but is still being copied and shared elsewhere” – says Różycka.
To improve the work of moderators – and protect them from contact with psychologically burdensome content – NASK, in cooperation with the Warsaw University of Technology, decided to use artificial intelligence. The algorithm developed by researchers as part of the APAKT project is designed to automatically analyze the illegal content sent to moderators and propose the order in which to handle the reports, starting with those requiring the most urgent intervention.
For example, the program will be able to indicate with 90% certainty that a given file resembles previously known material. Thanks to this, the moderator will be able to quickly confirm or reject the program’s assessment. This not only saves time but also protects the moderators’ mental health, since they will no longer have to check for themselves whether similar material has already appeared in the database.
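The article does not describe APAKT’s internals, but the idea of similarity-based triage can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the feature vectors (e.g. embeddings or perceptual hashes of files), the 0.9 threshold and the function names are hypothetical, not taken from the actual system.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def triage_reports(reports, known_vectors, duplicate_threshold=0.9):
    """Compare each reported file against vectors describing previously
    confirmed material and order the review queue so that likely new
    (non-duplicate) items surface first."""
    scored = []
    for report in reports:
        # report["vector"] is assumed to be precomputed elsewhere,
        # e.g. a perceptual hash or neural embedding of the file.
        best_match = max(
            (cosine_similarity(report["vector"], v) for v in known_vectors),
            default=0.0,
        )
        scored.append({
            "id": report["id"],
            "match_score": best_match,
            "likely_duplicate": best_match >= duplicate_threshold,
        })
    # Lowest match score first: unseen material is the most urgent.
    return sorted(scored, key=lambda r: r["match_score"])
```

In such a setup the moderator still makes the final call; the score only determines where a report lands in the queue, with likely new material surfacing first.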
Unlike many AI systems, this model is not just a “black box” that spits out answers with no way of tracing where its decision came from. APAKT will be able to explain why it classified material as pedophilic. Less obviously, the system can classify not only videos and photos, but also narrative texts describing the sexual abuse of children. “Depicting pedophilia in video and photos is illegal in Poland. When it comes to text, however, the issue is not precisely regulated in Polish law” – points out Dr. Okulska.
The researcher notes that there are two types of problematic texts related to pedophilia. One of them is grooming texts. “This means the solicitation of a minor by an adult in order to obtain nude photos or induce sexual contact” – says Dr. Okulska, explaining that grooming is punishable and that such cases are handled by the prosecutor’s office.
She adds, however, that there is another problem: texts that tell erotic or pornographic stories involving minors. “In such texts a lot happens at school, between an older person and a child. There are also many incest themes” – describes Dr. Okulska, adding that the way the content is presented in many of these texts is harmful.
“Moderators have no doubts – such texts are socially harmful. This content promotes pedophilia, behavior that is absolutely unacceptable and should never be normalized. In our opinion, it should be clearly stated that this content is illegal” – assesses Martyna Różycka.
And Dr. Okulska explains: “Not only are such texts often poorly written, but the story in them usually builds gradually. Before a moderator can work out whether a given text is innocent or promotes pedophilia, he has to read a great deal and put himself in the characters’ shoes. What he reads stays in his memory. And these are unpleasant, heavy topics. Moderators have access to psychological care, but it was natural to want to introduce models that will make their work easier.”
The researcher explains that the APAKT program will be able to point out to the moderator the specific fragments of text which show that the material actually describes sexual scenes involving minors. The program is supposed to identify the harmful elements on its own.
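How such fragment-level evidence might be surfaced can be sketched roughly as follows. The sentence-level granularity, the 0.7 threshold and the `score_fragment` classifier are illustrative assumptions – the article only says that the program points to specific passages.

```python
import re

def highlight_fragments(text, score_fragment, threshold=0.7):
    """Split a text into sentences, score each with a trained classifier
    and return the passages the model considers indicative of illegal
    content, so the moderator sees evidence rather than a bare verdict."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    flagged = []
    for i, sentence in enumerate(sentences):
        score = score_fragment(sentence)  # probability from the classifier
        if score >= threshold:
            flagged.append({"index": i, "fragment": sentence, "score": round(score, 3)})
    return flagged
```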
Work on AI to detect pedophilic content was not easy
Dr. Okulska says that work on the APAKT program was complicated by the fact that the algorithm had to be trained on pedophilic materials, the storage of which is illegal or controversial. Under the Act on the National Cybersecurity System, Dyżurnet.pl is the only team in Poland permitted to analyze child pornography directly.
“The scientists creating the models could not and did not want to have access to the Dyżurnet.pl data on which the algorithms were trained. And you can imagine that building algorithms that classify certain objects without being able to inspect those objects is very difficult. It is blindfolded work” – says Dr. Okulska.
The expert adds, however, that this limitation led to an innovative way of representing text for AI purposes in the part of the project concerning written materials. As she describes it, expertly configured systems allow for high-quality classification, yet they rely not only on the meaning of the analyzed text but also on its grammatical and statistical features. In the context of such difficult topics, they are therefore “safe” for the researcher.
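A simplified illustration of that idea: a text can be reduced to grammatical and statistical surface features that support classification but cannot be used to reconstruct the original wording. Real StyloMetrix-style vectors are far richer (they capture syntactic and morphological patterns computed with NLP tools); the handful of features below only sketches the principle.

```python
import re
import statistics

def stylometric_vector(text):
    """Reduce a text to grammatical/statistical surface features that
    support classification but cannot reproduce the original wording."""
    words = re.findall(r"\w+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    word_lengths = [len(w) for w in words] or [0]
    sentence_lengths = [len(s.split()) for s in sentences] or [0]
    return {
        "avg_word_len": statistics.mean(word_lengths),
        "avg_sentence_len": statistics.mean(sentence_lengths),
        "sentence_len_stdev": statistics.pstdev(sentence_lengths),
        "type_token_ratio": len({w.lower() for w in words}) / max(len(words), 1),
        "punctuation_ratio": sum(1 for c in text if c in ",.;:!?") / max(len(text), 1),
        "questions_per_sentence": text.count("?") / max(len(sentences), 1),
    }
```

Because such a vector contains no words from the source text, it can be handed to researchers building models without exposing them to the content itself – which is what makes the representation “safe” in the sense described above.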
She adds that the artificial intelligence does not yet operate at the data collection stage, but at the stage of sorting materials that have already been submitted for evaluation. The research is carried out under a grant from the National Center for Research and Development, and the program is expected to be ready for use within a few months. Its effectiveness is currently estimated at about 80 percent.
The researchers also hope that the APAKT program will be of interest to Internet providers and the owners of large portals, which – under the draft of the new act – are to be responsible for blocking minors’ access to pornography. The program may also prove useful to the police and forensic experts.
Foreign institutions dealing with the removal of pedophilic content from the Internet may also be interested in the program. APAKT can detect pedophilia in video and photos regardless of language. When it comes to detecting such content in texts, the program currently works only in Polish, although the components it uses – including the RoBERT model and StyloMetrix vectors – are already available in English and Ukrainian as well.
(PAP)
Source: Gazeta
