
AECD Embedding for Early Detection of Cryptomining Malware

A novel method for early detection of cryptomining malware that uses category- and DLL-based API embedding (AECD) together with TextCNN, achieving high accuracy from only a limited initial API call sequence.

1. Introduction & Overview

Cryptomining malware poses a serious threat to system security, causing hardware degradation and substantial energy waste. The primary challenge in countering it lies in detecting it early without sacrificing accuracy, and existing methods often fail to balance these two requirements. This paper introduces CEDMA (Cryptomining Malware Early Detection Method Based on AECD Embedding), a novel approach that uses the initial API call sequence of a software execution. CEDMA fuses API names, their functional categories, and the DLLs they are called from into a rich representation through the proposed AECD (API Embedding Based on Category and DLL), then applies a TextCNN (Text Convolutional Neural Network) model. The aim is to detect malicious mining activity quickly and with high accuracy.

  • Detection Accuracy (known samples): 98.21%
  • Detection Accuracy (unknown samples): 96.76%
  • Input Sequence Length: 3,000 API calls

2. Methodology: The CEDMA Framework

The core innovation of CEDMA is its multi-faceted feature representation for early-stage behavioral analysis.

2.1 AECD Embedding Mechanism

Traditional API sequence analysis often treats API calls as plain tokens. AECD enriches this representation by fusing embeddings from three sources:

  1. API Name Embedding ($e_{api}$): Represents the specific function called (e.g., `CreateFileW`, `RegSetValueEx`).
  2. API Category Embedding ($e_{cat}$): Represents the functional category of the call (e.g., File System, Registry, Network). This abstracts the behavior and aids generalization.
  3. DLL Embedding ($e_{dll}$): Represents the dynamic link library the API is called from (e.g., `kernel32.dll`, `ntdll.dll`). This provides context about the execution environment.

The final AECD vector for API call $i$ is constructed as $v_i^{AECD} = [e_{api}^{(i)} \oplus e_{cat}^{(i)} \oplus e_{dll}^{(i)}]$, where $\oplus$ denotes vector concatenation. This three-way fusion captures richer behavioral signatures from the limited data available early in execution.
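The concatenation above can be illustrated with a minimal sketch. The embedding widths, the random lookup tables, and the helper names `make_table` and `aecd_vector` are assumptions made for illustration; the source does not state the dimensions or how the embeddings are trained.

```python
import random

# Hypothetical embedding widths; the source does not give the real dimensions.
API_DIM, CAT_DIM, DLL_DIM = 8, 4, 4

def make_table(keys, dim, seed):
    """Build a toy lookup table of random vectors (stand-in for trained embeddings)."""
    rng = random.Random(seed)
    return {k: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for k in keys}

api_table = make_table(["CreateFileW", "NtCreateKey"], API_DIM, seed=0)
cat_table = make_table(["File System", "Registry"], CAT_DIM, seed=1)
dll_table = make_table(["kernel32.dll", "ntdll.dll"], DLL_DIM, seed=2)

def aecd_vector(api, category, dll):
    """Concatenate e_api, e_cat and e_dll into a single AECD vector."""
    return api_table[api] + cat_table[category] + dll_table[dll]

v = aecd_vector("NtCreateKey", "Registry", "ntdll.dll")
print(len(v))  # API_DIM + CAT_DIM + DLL_DIM
```

Because the three parts are concatenated rather than summed, each source keeps its own coordinates in the vector, so a downstream model can weight the API name, category, and DLL signals independently.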

2.2 TextCNN Model Architecture

The sequence of AECD vectors (from the first 3,000 API calls) is treated as a "text" document. A TextCNN model is used for classification because of its efficiency and its ability to capture local sequential patterns (n-gram features). The model typically consists of:

  • An embedding layer (initialized with the AECD vectors).
  • Multiple convolutional layers with different kernel sizes (e.g., 3, 4, 5) to extract features from different "gram" sizes of the API sequence.
  • Pooling and fully connected layers that lead to a binary classification output (benign vs. cryptomining malware).
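To make the layer stack concrete, here is a dependency-free sketch of a TextCNN-style forward pass: parallel 1D convolutions with kernel sizes 3, 4 and 5, max-over-time pooling, and a single sigmoid output. It is not the authors' model. The weights are random, the sizes are toy values, the input is assumed to already be a sequence of AECD vectors, and a practical version would be built and trained in a deep-learning framework.

```python
import math
import random

def conv1d_relu(seq, kernel, bias):
    """Slide one filter of shape (k, dim) over the sequence; return ReLU activations."""
    k = len(kernel)
    out = []
    for start in range(len(seq) - k + 1):
        s = bias
        for j in range(k):
            s += sum(w * x for w, x in zip(kernel[j], seq[start + j]))
        out.append(max(0.0, s))
    return out

def textcnn_forward(seq, filters, fc_weights, fc_bias):
    """filters is a list of (kernel, bias); each filter yields one max-pooled feature."""
    pooled = [max(conv1d_relu(seq, kernel, bias)) for kernel, bias in filters]
    logit = sum(w * f for w, f in zip(fc_weights, pooled)) + fc_bias
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid: score for the malware class

rng = random.Random(0)
DIM, SEQ_LEN = 16, 50  # toy sizes; the source uses 3,000 API calls
seq = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(SEQ_LEN)]

def rand_kernel(k):
    """Random (k, DIM) filter, standing in for learned weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(DIM)] for _ in range(k)]

# Two filters for each kernel size 3, 4 and 5.
filters = [(rand_kernel(k), 0.0) for k in (3, 4, 5) for _ in range(2)]
fc_weights = [rng.uniform(-0.5, 0.5) for _ in filters]
score = textcnn_forward(seq, filters, fc_weights, 0.0)
print(f"{score:.3f}")
```

Each kernel size looks at a different number of consecutive API calls, and max pooling keeps only the strongest match per filter, so the position of a suspicious call pattern inside the window does not matter.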

3. Experimental Results & Performance

The proposed CEDMA method was evaluated on a dataset comprising diverse cryptomining malware families (targeting multiple cryptocurrencies) and a variety of benign software samples.

Key Findings:

  • Using only the first 3,000 API calls after execution begins, CEDMA achieved 98.21% detection accuracy on known malware samples and 96.76% detection accuracy on unseen (unknown) malware samples.
  • The results indicate that AECD embedding compensates for the scarcity of information inherent in early-stage analysis by adding category and DLL context.
  • The method detects the malware before it establishes network connections, which is critical for early intervention and damage prevention.

Chart Description (Hypothetical): A bar chart comparing the detection accuracy, precision, and recall of CEDMA (with AECD) against a baseline model that uses API name embeddings only. The chart would show clear gains for CEDMA across all metrics, especially recall, indicating its strength at catching true malware instances early.
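For reference, the three metrics named above are usually defined as follows. This assumes the standard definitions with cryptomining malware as the positive class; the source does not spell out its own formulas. Here $TP$, $FP$, $TN$ and $FN$ are the counts of true positives, false positives, true negatives and false negatives.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}$$

Recall is the metric most sensitive to missed malware: every sample that slips through raises $FN$ and lowers recall, while leaving precision unchanged.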

4. Technical Analysis & Core Insights

Core Insight: The paper's main achievement is not just another application of neural networks; it is a feature-engineering innovation at the embedding level. While much research pursues ever more complex models (e.g., Transformers), CEDMA tackles the root problem of early detection directly: data scarcity. By injecting semantic (category) and environmental (DLL) context straight into the feature vector, it enriches the limited signal available from a short execution trace. This is analogous to how the cycle-consistency loss of CycleGAN (Zhu et al., 2017) enabled image-to-image translation without paired data: both address a fundamental data limitation with an architectural or representational insight rather than by simply scaling up.

Logical Flow: The argument is cleanly linear: 1) Early detection requires short sequences. 2) Short sequences lack discriminative power. 3) Therefore, increase the information density of each token (API call). 4) Achieve this by fusing orthogonal information channels (specific function, general category, source library). 5) Let a simple, efficient model (TextCNN) learn patterns from the enriched sequence. The pipeline is robust because it strengthens the input rather than complicating the processor.

Strengths & Flaws: The main strength is practical efficiency: high accuracy with low runtime overhead, which makes real-world deployment feasible. Using TextCNN rather than heavier RNNs or Transformers is a pragmatic choice that suits the need for speed in security applications. A significant flaw, however, is potential vulnerability to adversarial API calls. Sophisticated malware could inject benign-looking API call sequences from the "right" DLLs and categories to pollute the embedding space, a threat the paper does not discuss. In addition, the 3,000-call window, while a reasonable benchmark, is an arbitrary threshold; its robustness across software of varying complexity remains to be validated.

Actionable Insights: For security product managers, this research is a blueprint: prioritize feature representation over model complexity for time-critical threats. The AECD idea can be extended beyond APIs, for example to network flow logs (IP, port, protocol, packet-size patterns) or system logs. For researchers, the next step is to harden the approach against adversarial evasion, perhaps by incorporating anomaly-detection scores over the embedding space itself. The field should borrow more from robust ML research, such as the adversarial training techniques discussed in papers from the arXiv cs.CR (Cryptography and Security) repository.

5. Analysis Framework: A Practical Example

Scenario: Analyzing a suspicious, newly downloaded executable.

CEDMA Analysis Workflow:

  1. Dynamic Sandbox Execution: Run the executable in a controlled, instrumented environment for a short period (seconds).
  2. Trace Collection: Hook and record the first ~3,000 API calls, along with their corresponding DLLs.
  3. Feature Enrichment (AECD):
    • For each API call (e.g., `NtCreateKey`), query a predefined mapping to obtain its category (`Registry`).
    • Note the calling DLL (`ntdll.dll`).
    • Build the concatenated AECD vector from pre-trained embedding tables for `NtCreateKey`, `Registry`, and `ntdll.dll`.
  4. Sequence Formation & Classification: Feed the sequence of 3,000 AECD vectors into the pre-trained TextCNN model.
  5. Decision: The model outputs a probability score. If the score exceeds a threshold (e.g., >0.95), the file is flagged as likely cryptomining malware and quarantined before it initiates a network connection to a mining pool.
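Steps 3 to 5 of the workflow can be sketched as a small pipeline. Everything below is illustrative: the tiny lookup tables, the function names `enrich` and `decide`, and the constant placeholder scorer are hypothetical, and zero-padding traces shorter than 3,000 calls is an assumption the source does not confirm. The sandbox and hooking steps are not modeled; the trace is taken as a list of `(api, dll)` pairs.

```python
WINDOW = 3000      # number of initial API calls, from the text
THRESHOLD = 0.95   # example decision threshold, from the text

# Hypothetical toy tables; real ones would cover the full API surface
# and hold trained embeddings rather than hand-written numbers.
API_CATEGORY = {"NtCreateKey": "Registry", "CreateFileW": "File System"}
API_EMB = {"NtCreateKey": [0.1, 0.2], "CreateFileW": [0.3, 0.1]}
CAT_EMB = {"Registry": [0.5], "File System": [0.2]}
DLL_EMB = {"ntdll.dll": [0.9], "kernel32.dll": [0.4]}
PAD = [0.0, 0.0, 0.0, 0.0]  # assumed padding vector for short traces

def enrich(trace):
    """Map (api, dll) pairs to AECD vectors, truncated or zero-padded to WINDOW."""
    vectors = []
    for api, dll in trace[:WINDOW]:
        category = API_CATEGORY[api]  # step 3a: category lookup
        vectors.append(API_EMB[api] + CAT_EMB[category] + DLL_EMB[dll])
    vectors.extend([PAD] * (WINDOW - len(vectors)))
    return vectors

def decide(trace, model):
    """Score the enriched sequence and apply the threshold (steps 4 and 5)."""
    score = model(enrich(trace))
    return "quarantine" if score > THRESHOLD else "allow"

trace = [("NtCreateKey", "ntdll.dll"), ("CreateFileW", "kernel32.dll")]
placeholder_model = lambda vectors: 0.97  # NOT a trained TextCNN, just a constant
print(decide(trace, placeholder_model))
```

Passing the classifier in as a plain callable keeps the enrichment and decision logic testable on their own, independent of whichever trained model is eventually plugged in.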

Note: This is a conceptual framework. A real implementation requires extensive preprocessing, embedding training, and model tuning.

6. Future Applications & Research Directions

  • Expanded Embedding Context: Future work could incorporate additional context, such as API call arguments (e.g., file paths, registry keys) or thread/process information, into the embedding scheme to build richer behavioral profiles.
  • Cross-Platform Detection: Adapting the AECD concept to other platforms (Linux syscalls, macOS APIs) for comprehensive endpoint protection.
  • Real-Time Streaming Detection: Deploying CEDMA as a streaming analyzer that makes predictions continuously as API calls are generated, relaxing the fixed-window constraint.
  • Integration with Threat Intelligence: Using AECD-derived feature vectors as fingerprints to query threat intelligence platforms for similar known malware behaviors.
  • Adversarial Robustness: As noted in the analysis, investigating defenses against malware designed to evade this specific detection method is an important next step.
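As one possible reading of the streaming direction listed above, the sketch below re-scores a bounded buffer of recent AECD vectors every `stride` calls instead of waiting for a full fixed window. The class name `StreamingScorer`, the `stride` parameter, and the toy scorer are all hypothetical; the source proposes the idea but does not describe a design.

```python
from collections import deque

class StreamingScorer:
    """Re-score a bounded buffer of recent AECD vectors every `stride` calls."""

    def __init__(self, model, window=3000, stride=500, threshold=0.95):
        self.model = model
        self.buffer = deque(maxlen=window)  # oldest vectors drop out automatically
        self.stride = stride
        self.threshold = threshold
        self.seen = 0

    def push(self, vector):
        """Add one vector; return True if a score taken now crosses the threshold."""
        self.buffer.append(vector)
        self.seen += 1
        if self.seen % self.stride != 0:
            return False  # only evaluate the model at stride boundaries
        return self.model(list(self.buffer)) > self.threshold

# Toy scorer (an assumption, not a trained model): confident once 1,000 calls are seen.
toy_model = lambda vecs: 0.99 if len(vecs) >= 1000 else 0.1
scorer = StreamingScorer(toy_model)
first_alert = next(i for i in range(1, 3001) if scorer.push([0.0]))
print(first_alert)  # 1000
```

The stride trades detection latency against compute: a smaller stride can raise an alert sooner but runs the classifier more often.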

7. References

  1. Cao, C., Guo, C., Li, X., & Shen, G. (2024). Cryptomining Malware Early Detection Method Based on AECD Embedding. Journal of Frontiers of Computer Science and Technology, 18(4), 1083-1093.
  2. Zhu, J., Park, T., Isola, P., & Efros, A. A. (2017). Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. Proceedings of the IEEE International Conference on Computer Vision (ICCV).
  3. SonicWall. (2023). SonicWall Cyber Threat Report 2023. Retrieved from the SonicWall website.
  4. Berecz, T., et al. (2021). [Relevant work on API-based malware detection]. Conference on Security and Privacy.
  5. Kim, Y. (2014). Convolutional Neural Networks for Sentence Classification. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). (Original TextCNN paper.)
  6. arXiv.org, cs.CR (Cryptography and Security) category. [Repository for recent adversarial ML and security research].