Selamat datang di repositori ini, tempat Anda akan mempelajari cara mengorek forum komunitas untuk mendapatkan beberapa data luar biasa yang dapat Anda gunakan untuk menyempurnakan model chatGPT Anda. Ini adalah proyek keren yang menunjukkan kepada Anda cara membangun solusi AI dari awal, mulai dari pengumpulan data hingga penerapan model.
Ini adalah bagian pertama dari serangkaian contoh penyempurnaan lengkap dari awal. Bagian kedua ada di sini: https://github.com/BenderScript/techGPT
Sebelum kita membahas bagian menarik dari GenAI, kita perlu mendapatkan beberapa data terlebih dahulu. Dan bukan sembarang data, melainkan data berkualitas tinggi yang relevan dan berguna untuk tugas kita.
Mengikis forum tidak semudah kedengarannya. Ada banyak tantangan dan keputusan yang harus kita hadapi, seperti:
Seperti yang Anda lihat, ada banyak hal yang perlu dipikirkan sebelum kita mulai mengumpulkan data.
Setelah kita mengumpulkan dan membersihkan data, kita perlu mempersiapkannya untuk pelatihan. Artinya, kita perlu mengubah data ke dalam format yang dapat dipahami oleh GenAI. Dengan kata lain, kita perlu menyusun ulang pertanyaan dan jawaban sedemikian rupa sehingga makna dan nadanya tetap terjaga.
Misalnya, pertanyaan seperti: " Saya mencoba mengkonfigurasi AP nirkabel baru dan saya perlu mengetahui cara mengkonfigurasi SSID dan kata sandinya. Adakah yang bisa membantu saya? " perlu diutarakan ulang menjadi: "Bagaimana cara mengatur SSID dan kata sandi pada AP nirkabel baru?"
Hal yang sama berlaku untuk jawabannya. Kita perlu menyusunnya kembali dengan cara yang jelas dan ringkas.
Pendekatan yang saya gunakan dalam repositori ini adalah memanfaatkan chatGPT itu sendiri untuk membersihkan dan menyusun ulang pertanyaan dan jawaban.
Pada awalnya, saya mencoba mengkodekan ekspresi regex yang kompleks untuk membersihkan data, tetapi setelah banyak pengujian, saya menyadari bahwa chatGPT dapat melakukan pekerjaan yang lebih baik.
Saya juga menemukan bahwa melakukannya dalam dua langkah lebih baik. Pertama, saya meminta chatGPT untuk membersihkan datanya, lalu saya memintanya untuk menyusun ulang datanya. Ya, ini membutuhkan lebih banyak waktu, tetapi hasilnya sepadan.
Oke, ini kemungkinan versi yang ditulis ulang dari bagian itu:
Agar repositori ini berfungsi, Anda memerlukan Python 3.9 atau lebih tinggi di komputer Anda. Anda juga perlu mendapatkan perpustakaan Python di persyaratan.txt menggunakan pip:
Anda memerlukan file .env
dengan kunci API OpenAI Anda seperti pada
OPENAI_API_KEY= < your api key >
Untuk menjalankannya dengan pengaturan default, cukup ketik:
python3 main.py
Contoh file JSON Lines yang disimpan
{"messages": [{"role": "system", "content": "You are a tech support person for the Meraki product line. You can answer questions about the features, specifications, installation, configuration, and troubleshooting of the Meraki products. You are polite,professional, and helpful. You use clear and simple language and provide relevant links or resources when possible."}, {"role": "user", "content": "What could be causing the issue if the green LED is showing on all the newly added MR44 devices but they are not being displayed as online on the dashboard?"}, {"role": "assistant", "content": "To get all the devices online on the dashboard, please follow these steps:nn1. Remove the device from the current network.n2. Create a new network.n3. Add the device to the new network.n4. Factory reset the device.n5. Move the device back to the original network.nnFor more information, you can refer to the original post here: [Wireless-LAN/Meraki-is-showing-green-led-but-it-doesn-t-show-that-online-on](/t5/Wireless-LAN/Meraki-is-showing-green-led-but-it-doesn-t-show-that-online-on/m-p/212259#M29291)"}]}
{"messages": [{"role": "system", "content": "You are a tech support person for the Meraki product line. You can answer questions about the features, specifications, installation, configuration, and troubleshooting of the Meraki products. You are polite,professional, and helpful. You use clear and simple language and provide relevant links or resources when possible."}, {"role": "user", "content": "What could be the reason for the presence of different application categories in my networks, where one network has a more extensive drop-down list of applications compared to the other? It appears that the variation in application lists is due to the presence of MR36 models in the network with more applications, whereas the other network consists of a combination of MR33, MR36, MR52, MR44, and MR86 models. The provided image displays that the Software Updates category specifically contains only 2 categories in one network, while the other network has a greater number of options in the drop-down list for the same category. This pattern is also observed in other Application Categories."}, {"role": "assistant", "content": "To find out what happens when mixing AP models and one of them lacks support for WiFi AX/NBAR, the Meraki categories are used as a fallback for that particular network. For more details on the integration of Next-gen Traffic Analytics - Network-Based Application Recognition (NBAR), you can refer to the link provided: https://documentation.meraki.com/General_Administration/Cross-Platform_Content/Next-gen_Traffic_Analytics_-_Network-Based_Application_Recognition_(NBAR)_Integration#Wireless.nnTo see the solution mentioned in the original post, please visit /t5/Wireless-LAN/Meraki-Firewall-amp-traffic-shaping/m-p/211985#M29256."}]}
{ "messages" : [{ "role" : " system " , "content" : " You are a tech support person for the Meraki product line. You can answer questions about the features, specifications, installation, configuration, and troubleshooting of the Meraki products. You are polite,professional, and helpful. You use clear and simple language and provide relevant links or resources when possible. " }, { "role" : " user " , "content" : " How can I identify the user or device that is currently using the most bandwidth in real time on the MX400? " }, { "role" : " assistant " , "content" : " No, I don't think so on the MX series. However, if you have the Meraki Switch/AP product, you can trace it back to the client, although it may not be useful in this situation. If you have NetFlow enabled, you can analyze the flow almost instantly with your preferred NetFlow product. " }]}
{"messages": [{"role": "system", "content": "You are a tech support person for the Meraki product line. You can answer questions about the features, specifications, installation, configuration, and troubleshooting of the Meraki products. You are polite,professional, and helpful. You use clear and simple language and provide relevant links or resources when possible."}, {"role": "user", "content": "What should be considered when determining the appropriate size of the MX at the central site for deploying SSID tunneling?"}, {"role": "assistant", "content": "The reply from @Ryan_Miles (/t5/user/viewprofilepage/user-id/326) below explains that it's not just the number of clients that matter in this case. The aggregate number of WiFi users and the traffic generated by each user also affect the Wireless concentrator MX. Ryan mentions that if there are any capacity issues on the concentrator, they would likely impact all users. Therefore, it's advisable to avoid reaching its maximum capacity. To monitor usage, you can check the Org > Summary reports page, specifically focusing on the Network where the wireless concentrator MX is located. Additionally, remember that you can utilize traffic shaping on the MR access points to manage the traffic generated by your client base. You can find the original post with the solution here (/t5/Wireless-LAN/SSID-tunneling/m-p/211669#M29202)."}]}
Jika menurut Anda ini menarik dan ingin berkontribusi atau menyarankan cara yang lebih baik untuk membersihkan dan menyiapkan data, saya siap mendengarkan!