Translator | Li Rui (李睿)
Reviewer | Chong Lou (重楼)
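This article shows how to use Cloudera DataFlow, powered by Apache NiFi, to interact with IBM WatsonX.AI foundation large language models in real time. Any of the foundation models can be used, such as Google FLAN T5 XXL or the IBM Granite models.
It also shows how easy it is to build a real-time data pipeline that feeds questions from Slack-like and mobile applications directly to secure WatsonX.AI models running in IBM Cloud. Cloudera DataFlow handles all of the security, management, lineage, and governance. As part of decision-making, the flow can choose a different WatsonX.AI model on the fly based on the type of prompt. For example, continuing a piece of text and answering a question can be sent to different models: Google FLAN T5 XXL works well for answering questions, while one of the IBM Granite models is a better fit for continuing text.
You will notice how quickly the WatsonX.AI models return the results we need. The flow applies some quick transformation and enrichment and then sends the results to Cloudera Apache Kafka for continuous analytics and distribution to many other applications, systems, platforms, and downstream consumers. It also returns the answer to the original requester, who could be someone in a Slack channel or someone in an application. All of this happens in real time, with no code, and with full governance, lineage, data management, and security at any scale and on any platform.
The combined power of IBM and Cloudera for real-time data and AI in private, public, and hybrid cloud environments is just getting started.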
Step-by-Step Real-Time Flow
First, type a question in Slack: "Q: What is a good way to integrate Generative AI and Apache NiFi?"
NiFi Flow Top
Once the question is posted, the Slack server sends the event to our registered service, which can be hosted anywhere that is publicly reachable.
- Slack API: https://api.slack.com/
Slack API
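Once this is enabled, the server starts receiving a JSON event for every Slack post. These events are easy to receive and parse in NiFi. Cloudera DataFlow makes it easy to receive secure HTTPS REST calls in the public cloud hosted edition, even in Designer mode.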
NiFi Top Flow 2
In the first part of the top-level NiFi flow, we receive the REST JSON POST, which looks like this:
Slackbot 1.0 (+https://api.slack.com/robots)
application/json
POST
HTTP/1.1
{
"token" : "qHvJe59yetAp1bao6wmQzH0C",
"team_id" : "T1SD6MZMF",
"context_team_id" : "T1SD6MZMF",
"context_enterprise_id" : null,
"api_app_id" : "A04U64MN9HS",
"event" : {
"type" : "message",
"subtype" : "bot_message",
"text" : "==== NiFi to IBM <http://WatsonX.AI|WatsonX.AI> LLM Answers\n\nOn Date: Wed, 15 Nov 20
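This is a very rich, detailed JSON document that could be pushed as-is to an Apache Iceberg Open Cloud Lakehouse, a Kafka topic, or an object store (an enhancement option). Here, the flow parses out only what it needs.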
EvaluateJSONPath
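We parse out the channel ID and the plain text of the post. We only want messages from the general channel ("C1SD6N197"). The text is then copied to an inputs field, which is the format Hugging Face requires.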
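As a rough illustration of this parsing step outside NiFi, the following Python sketch extracts the channel and text from a Slack event body and keeps only posts from the wanted channel. The function name is hypothetical, and the `channel` key is assumed from the standard Slack message event, since the sample JSON above is truncated before that field appears.

```python
import json
from typing import Optional

GENERAL_CHANNEL = "C1SD6N197"  # channel ID quoted in the article


def extract_inputs(raw_body: str, channel_id: str = GENERAL_CHANNEL) -> Optional[dict]:
    """Return {"channel", "inputs"} for posts in the wanted channel, else None."""
    payload = json.loads(raw_body)
    event = payload.get("event") or {}
    # Assumption: message events carry "channel" and "text" keys; only "text"
    # is visible in the truncated sample shown in the article.
    if event.get("type") != "message" or event.get("channel") != channel_id:
        return None
    text = (event.get("text") or "").strip()
    if not text:
        return None
    # "inputs" mirrors the field name used by the later SQL filters.
    return {"channel": event["channel"], "inputs": text}
```

Returning `None` for other channels plays the role of dropping the flow file in NiFi.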
Next, we check the input: if it is about stocks or weather (more categories to come), we avoid calling the LLM.
SELECT * FROM FLOWFILE
WHERE upper(inputs) like '%WEATHER%'
AND not upper(inputs) like '%LLM SKIPPED%'
SELECT * FROM FLOWFILE
WHERE upper(inputs) like '%STOCK%'
AND not upper(inputs) like '%LLM SKIPPED%'
SELECT * FROM FLOWFILE
WHERE (upper(inputs) like 'QUESTION:%'
OR upper(inputs) like 'Q:%') and not upper(inputs) like '%WEATHER%'
and not upper(inputs) like '%STOCK%'
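The three queries above amount to a small routing rule. A minimal Python sketch of the same logic is shown here for readers who want to test the conditions outside NiFi; the function name and route labels are hypothetical. The "LLM SKIPPED" check presumably keeps the bot from reacting to its own replies, which start with that phrase.

```python
def route(inputs: str) -> list[str]:
    """Mirror the three QueryRecord filters: weather, stock, or LLM question."""
    text = inputs.upper()
    skipped = "LLM SKIPPED" in text
    routes = []
    # Query 1: weather requests that are not the bot's own replies.
    if "WEATHER" in text and not skipped:
        routes.append("weather")
    # Query 2: stock requests that are not the bot's own replies.
    if "STOCK" in text and not skipped:
        routes.append("stock")
    # Query 3: explicit questions that mention neither weather nor stocks.
    is_question = text.startswith("QUESTION:") or text.startswith("Q:")
    if is_question and "WEATHER" not in text and "STOCK" not in text:
        routes.append("llm")
    return routes
```

A post that matches none of the filters returns an empty list, which corresponds to the message being ignored.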
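For stock processing:
To work out which stock is being asked about, we use an OpenNLP processor to extract the company name.
This requires downloading the processor and the entity extraction models.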
- Github - tspannhw/nifi-nlp-processor: Apache NiFi NLP Processor
- Open NLP Example Apache NiFi Processor
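Next, we pass the company name to an AlphaVantage HTTP REST endpoint that converts company names to stock symbols. A free account allows only a few calls per day, so if the call fails we bypass this step and try to use whatever was typed in.
RouteOnContent filters out the error message.
We then use a QueryRecord processor to convert the CSV to JSON and filter it.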
SELECT name as companyName, symbol FROM FLOWFILE
ORDER BY matchScore DESC
LIMIT 1
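The query above keeps the best-scoring match. A minimal Python sketch of the same selection, assuming the CSV has at least the `symbol`, `name`, and `matchScore` columns referenced in the query, could look like this; the function name is hypothetical.

```python
import csv
import io
from typing import Optional


def best_symbol_match(csv_text: str) -> Optional[dict]:
    """Pick the row with the highest matchScore, like ORDER BY ... DESC LIMIT 1."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return None  # nothing matched; the caller falls back to the raw input
    # Compare scores numerically rather than as strings.
    best = max(rows, key=lambda row: float(row["matchScore"]))
    return {"companyName": best["name"], "symbol": best["symbol"].strip()}
```

Returning `None` on an empty result leaves room for the fallback described earlier, where the flow uses whatever the user typed.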
SplitRecord ensures there is only one record. We then run EvaluateJsonPath to pull the fields out as attributes.
In UpdateAttribute, we trim the symbol just in case.
${stockSymbol:trim()}
We then pass the stock symbol to Twelve Data via InvokeHTTP to get the stock data.
A lot of stock data comes back.
{
"meta" : {
"symbol" : "IBM",
"interval" : "1min",
"currency" : "USD",
"exchange_timezone" : "America/New_York",
"exchange" : "NYSE",
"mic_code" : "XNYS",
"type" : "Common Stock"
},
"values" : [ {
"datetime" : "2023-11-15 10:37:00",
"open" : "152.07001",
"high" : "152.08000",
"low" : "151.99500",
"close" : "152.00999",
"volume" : "8525"
}, {
"datetime" : "2023-11-15 10:36:00",
"open" : "152.08501",
"high" : "152.12250",
"low" : "152.08000",
"close" : "152.08501",
"volume" : "15204"
} ...
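We then run EvaluateJSONPath to get the exchange information.
Because this only needs to go back to Slack, we fork the record to keep a single record. UpdateRecord calls enrich the stock data with other values, and a QueryRecord then limits the output to one record to send to Slack.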
SELECT * FROM FLOWFILE
ORDER BY "datetime" DESC
LIMIT 1
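To make the "keep only the newest record" step concrete, here is a small Python sketch that works on a response shaped like the Twelve Data sample shown earlier. The function name and the returned key names are hypothetical.

```python
from datetime import datetime


def latest_bar(payload: dict) -> dict:
    """Return the most recent entry in "values" together with symbol and exchange."""
    values = payload.get("values") or []
    if not values:
        raise ValueError("no stock values returned")
    # Parse the timestamp so the comparison is chronological, not textual.
    newest = max(
        values,
        key=lambda v: datetime.strptime(v["datetime"], "%Y-%m-%d %H:%M:%S"),
    )
    return {
        "symbol": payload["meta"]["symbol"],
        "exchange": payload["meta"]["exchange"],
        "datetime": newest["datetime"],
        "close": newest["close"],
    }
```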
We run EvaluateJsonPath to get the most valuable fields to display.
We then run PutSlack with our message.
LLM Skipped. Stock Value for ${companyName} [${nlp_org_1}/${stockSymbol}] on ${date} is ${closeStockValue}. stock date ${stockdateT
For stock news, the flow requests the Yahoo Finance RSS headline feed for the symbol:
https://feeds.finance.yahoo.com/rss/2.0/headline?s=${stockSymbol:trim()}&region=US&lang=en-US
QueryRecord converts the RSS/XML records to JSON.
We then run SplitJSON to break out the individual news items.
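For comparison, the RSS-to-JSON conversion and item split can be sketched in a few lines of Python using only the standard library. The function name is hypothetical; the output keys follow the attribute names used in the Slack news message in this article.

```python
import xml.etree.ElementTree as ET


def rss_items(xml_text: str) -> list[dict]:
    """Convert each <item> of an RSS 2.0 feed into a plain dict."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "description": item.findtext("description", default=""),
            "guid": item.findtext("guid", default=""),
            # RSS spells the element "pubDate"; the flow uses "pubdate".
            "pubdate": item.findtext("pubDate", default=""),
        })
    return items
```

Taking `rss_items(feed)[0]` would then correspond to limiting the output to a single news item.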
SplitRecord limits the output to one record. EvaluateJSONPath gets the fields needed for the Slack message.
We then run UpdateRecord to finalize the JSON.
Finally, we send the message to Slack.
LLM Skipped. Stock News Information for ${companyName} [${nlp_org_1}/${stockSymbol}] on ${date}
${title} : ${description}.
${guid} article date ${pubdate}
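For weather requests, we follow a route similar to stocks (caching with Redis @ Aiven should be added). The OpenNLP processor extracts the location that the weather is wanted for.
The next step takes the processor output and builds the value to send to the geocoder.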
weatherlocation = ${nlp_location_1:notNull():ifElse(${nlp_location_1}, "New York City")}
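If no valid location is found, the flow falls back to "New York City." Another lookup could be used instead. Work is under way on loading all locations, which would allow some advanced PostgreSQL searches, or perhaps OpenSearch or a vectorized data store.
We pass the location to Open Meteo via InvokeHTTP to find its geographic coordinates.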
https://geocoding-api.open-meteo.com/v1/search?name=${weatherlocation:trim():urlEncode()}&count=1&language=en&format=json
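The fallback and URL construction can be sketched in Python as follows. The function name is hypothetical, and unlike the NiFi expression, this sketch also treats a blank string as a missing location.

```python
from typing import Optional
from urllib.parse import quote_plus

DEFAULT_LOCATION = "New York City"  # fallback used in the article


def geocode_url(nlp_location: Optional[str]) -> str:
    """Build the Open-Meteo geocoding URL, falling back to the default city."""
    # Assumption: blank or whitespace-only values count as "not found".
    location = (nlp_location or "").strip() or DEFAULT_LOCATION
    return (
        "https://geocoding-api.open-meteo.com/v1/search"
        f"?name={quote_plus(location)}&count=1&language=en&format=json"
    )
```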
We then parse the values we need from the results.
{
"results" : [ {
"id" : 5128581,
"name" : "New York",
"latitude" : 40.71427,
"longitude" : -74.00597,
"elevation" : 10.0,
"feature_code" : "PPL",
"country_code" : "US",
"admin1_id" : 5128638,
"timezone" : "America/New_York",
"population" : 8175133,
"postcodes" : [ "10001", "10002", "10003", "10004", "10005", "10006", "10007", "10008", "10009", "10010", "10011", "10012", "10013", "10014", "10016", "10017", "10018", "10019", "10020", "10021", "10022", "10023", "10024", "10025", "10026", "10027", "10028", "10029", "10030", "10031", "10032", "10033", "10034", "10035", "10036", "10037", "10038", "10039", "10040", "10041", "10043", "10044", "10045", "10055", "10060", "10065", "10069", "10080", "10081", "10087", "10090", "10101", "10102", "10103", "10104", "10105", "10106", "10107", "10108", "10109", "10110", "10111", "10112", "10113", "10114", "10115", "10116", "10117", "10118", "10119", "10120", "10121", "10122", "10123", "10124", "10125", "10126", "10128", "10129", "10130", "10131", "10132", "10133", "10138", "10150", "10151", "10152", "10153", "10154", "10155", "10156", "10157", "10158", "10159", "10160", "10161", "10162", "10163", "10164", "10165", "10166", "10167", "10168", "10169", "10170", "10171", "10172", "10173", "10174", "10175", "10176", "10177", "10178", "10179", "10185", "10199", "10203", "10211", "10212", "10213", "10242", "10249", "10256", "10258", "10259", "10260", "10261", "10265", "10268", "10269", "10270", "10271", "10272", "10273", "10274", "10275", "10276", "10277", "10278", "10279", "10280", "10281", "10282", "10285", "10286" ],
"country_id" : 6252001,
"country" : "United States",
"admin1" : "New York"
} ],
"generationtime_ms" : 0.92196465
}
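As a sketch of what is done with this response, the Python function below reads the first result's coordinates and builds the api.weather.gov points URL that the flow calls next. The function name is hypothetical.

```python
def points_url(geocode_response: dict) -> str:
    """Turn the first geocoding result into an api.weather.gov points URL."""
    results = geocode_response.get("results") or []
    if not results:
        raise LookupError("geocoder returned no results")
    first = results[0]
    # Same shape as the NiFi expression: .../points/<latitude>,<longitude>
    return f"https://api.weather.gov/points/{first['latitude']},{first['longitude']}"
```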
With those values parsed, we can call another API via InvokeHTTP to get the current weather for that latitude and longitude.
https://api.weather.gov/points/${latitude:trim()},${longitude:trim()}
The result is GeoJSON.
{
"@context": [
"https://geojson.org/geojson-ld/geojson-context.jsonld",
{
"@version": "1.1",
"wx": "https://api.weather.gov/ontology#",
"s": "https://schema.org/",
"geo": "http://www.opengis.net/ont/geosparql#",
"unit": "http://codes.wmo.int/common/unit/",
"@vocab": "https://api.weather.gov/ontology#",
"geometry": {
"@id": "s:GeoCoordinates",
"@type": "geo:wktLiteral"
},
"city": "s:addressLocality",
"state": "s:addressRegion",
"distance": {
"@id": "s:Distance",
"@type": "s:QuantitativeValue"
},
"bearing": {
"@type": "s:QuantitativeValue"
},
"value": {
"@id": "s:value"
},
"unitCode": {
"@id": "s:unitCode",
"@type": "@id"
},
"forecastOffice": {
"@type": "@id"
},
"forecastGridData": {
"@type": "@id"
},
"publicZone": {
"@type": "@id"
},
"county": {
"@type": "@id"
}
}
],
"id": "https://api.weather.gov/points/40.7143,-74.006",
"type": "Feature",
"geometry": {
"type": "Point",
"coordinates": [
-74.006,
40.714300000000001
]
},
"properties": {
"@id": "https://api.weather.gov/points/40.7143,-74.006",
"@type": "wx:Point",
"cwa": "OKX",
"forecastOffice": "https://api.weather.gov/offices/OKX",
"gridId": "OKX",
"gridX": 33,
"gridY": 35,
"forecast": "https://api.weather.gov/gridpoints/OKX/33,35/forecast",
"forecastHourly": "https://api.weather.gov/gridpoints/OKX/33,35/forecast/hourly",
"forecastGridData": "https://api.weather.gov/gridpoints/OKX/33,35",
"observationStations": "https://api.weather.gov/gridpoints/OKX/33,35/stations",
"relativeLocation": {
"type": "Feature",
"geometry": {
"type": "Point",
"coordinates": [
-74.0279259,
40.745251000000003
]
},
"properties": {
"city": "Hoboken",
"state": "NJ",
"distance": {
"unitCode": "wmoUnit:m",
"value": 3906.1522008034999
},
"bearing": {
"unitCode": "wmoUnit:degree_(angle)",
"value": 151
}
}
},
"forecastZone": "https://api.weather.gov/zones/forecast/NYZ072",
"county": "https://api.weather.gov/zones/county/NYC061",
"fireWeatherZone": "https://api.weather.gov/zones/fire/NYZ212",
"timeZone": "America/New_York",
"radarStation": "KDIX"
}
}
EvaluateJSONPath grabs the forecast URL.
We then call that forecast URL via InvokeHTTP.
This produces a larger JSON output, which we parse for the results we want to return to Slack.
{
"@context": [
"https://geojson.org/geojson-ld/geojson-context.jsonld",
{
"@version": "1.1",
"wx": "https://api.weather.gov/ontology#",
"geo": "http://www.opengis.net/ont/geosparql#",
"unit": "http://codes.wmo.int/common/unit/",
"@vocab": "https://api.weather.gov/ontology#"
}
],
"type": "Feature",
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
-74.025095199999996,
40.727052399999998
],
[
-74.0295579,
40.705361699999997
],
[
-74.000948300000005,
40.701977499999998
],
[
-73.996479800000003,
40.723667899999995
],
[
-74.025095199999996,
40.727052399999998
]
]
]
},
"properties": {
"updated": "2023-11-15T14:34:46+00:00",
"units": "us",
"forecastGenerator": "BaselineForecastGenerator",
"generatedAt": "2023-11-15T15:11:39+00:00",
"updateTime": "2023-11-15T14:34:46+00:00",
"validTimes": "2023-11-15T08:00:00+00:00/P7DT17H",
"elevation": {
"unitCode": "wmoUnit:m",
"value": 2.1335999999999999
},
"periods": [
{
"number": 1,
"name": "Today",
"startTime": "2023-11-15T10:00:00-05:00",
"endTime": "2023-11-15T18:00:00-05:00",
"isDaytime": true,
"temperature": 51,
"temperatureUnit": "F",
"temperatureTrend": null,
"probabilityOfPrecipitation": {
"unitCode": "wmoUnit:percent",
"value": null
},
"dewpoint": {
"unitCode": "wmoUnit:degC",
"value": 2.2222222222222223
},
"relativeHumidity": {
"unitCode": "wmoUnit:percent",
"value": 68
},
"windSpeed": "1 to 7 mph",
"windDirection": "SW",
"icon": "https://api.weather.gov/icons/land/day/bkn?size=medium",
"shortForecast": "Partly Sunny",
"detailedForecast": "Partly sunny, with a high near 51. Southwest wind 1 to 7 mph."
},
{
"number": 2,
"name": "Tonight",
"startTime": "2023-11-15T18:00:00-05:00",
"endTime": "2023-11-16T06:00:00-05:00",
"isDaytime": false,
"temperature": 44,
"temperatureUnit": "F",
"temperatureTrend": null,
"probabilityOfPrecipitation": {
"unitCode": "wmoUnit:percent",
"value": null
},
"dewpoint": {
"unitCode": "wmoUnit:degC",
"value": 3.8888888888888888
},
"relativeHumidity": {
"unitCode": "wmoUnit:percent",
"value": 82
},
"windSpeed": "8 mph",
"windDirection": "SW",
"icon": "https://api.weather.gov/icons/land/night/sct?size=medium",
"shortForecast": "Partly Cloudy",
"detailedForecast": "Partly cloudy, with a low around 44. Southwest wind around 8 mph."
},
{
"number": 3,
"name": "Thursday",
"startTime": "2023-11-16T06:00:00-05:00",
"endTime": "2023-11-16T18:00:00-05:00",
"isDaytime": true,
"temperature": 60,
"temperatureUnit": "F",
"temperatureTrend": "falling",
"probabilityOfPrecipitation": {
"unitCode": "wmoUnit:percent",
"value": null
},
"dewpoint": {
"unitCode": "wmoUnit:degC",
"value": 5.5555555555555554
},
"relativeHumidity": {
"unitCode": "wmoUnit:percent",
"value": 82
},
"windSpeed": "6 mph",
"windDirection": "SW",
"icon": "https://api.weather.gov/icons/land/day/few?size=medium",
"shortForecast": "Sunny",
"detailedForecast": "Sunny. High near 60, with temperatures falling to around 58 in the afternoon. Southwest wind around 6 mph."
},
{
"number": 4,
"name": "Thursday Night",
"startTime": "2023-11-16T18:00:00-05:00",
"endTime": "2023-11-17T06:00:00-05:00",
"isDaytime": false,
"temperature": 47,
"temperatureUnit": "F",
"temperatureTrend": null,
"probabilityOfPrecipitation": {
"unitCode": "wmoUnit:percent",
"value": null
},
"dewpoint": {
"unitCode": "wmoUnit:degC",
"value": 6.1111111111111107
},
"relativeHumidity": {
"unitCode": "wmoUnit:percent",
"value": 80
},
"windSpeed": "3 mph",
"windDirection": "SW",
"icon": "https://api.weather.gov/icons/land/night/few?size=medium",
"shortForecast": "Mostly Clear",
"detailedForecast": "Mostly clear, with a low around 47. Southwest wind around 3 mph."
},
{
"number": 5,
"name": "Friday",
"startTime": "2023-11-17T06:00:00-05:00",
"endTime": "2023-11-17T18:00:00-05:00",
"isDaytime": true,
"temperature": 63,
"temperatureUnit": "F",
"temperatureTrend": "falling",
"probabilityOfPrecipitation": {
"unitCode": "wmoUnit:percent",
"value": 20
},
"dewpoint": {
"unitCode": "wmoUnit:degC",
"value": 12.222222222222221
},
"relativeHumidity": {
"unitCode": "wmoUnit:percent",
"value": 86
},
"windSpeed": "2 to 10 mph",
"windDirection": "S",
"icon": "https://api.weather.gov/icons/land/day/bkn/rain,20?size=medium",
"shortForecast": "Partly Sunny then Slight Chance Light Rain",
"detailedForecast": "A slight chance of rain after 1pm. Partly sunny. High near 63, with temperatures falling to around 61 in the afternoon. South wind 2 to 10 mph. Chance of precipitation is 20%."
},
{
"number": 6,
"name": "Friday Night",
"startTime": "2023-11-17T18:00:00-05:00",
"endTime": "2023-11-18T06:00:00-05:00",
"isDaytime": false,
"temperature": 51,
"temperatureUnit": "F",
"temperatureTrend": null,
"probabilityOfPrecipitation": {
"unitCode": "wmoUnit:percent",
"value": 70
},
"dewpoint": {
"unitCode": "wmoUnit:degC",
"value": 12.777777777777779
},
"relativeHumidity": {
"unitCode": "wmoUnit:percent",
"value": 100
},
"windSpeed": "6 to 10 mph",
"windDirection": "SW",
"icon": "https://api.weather.gov/icons/land/night/rain,60/rain,70?size=medium",
"shortForecast": "Light Rain Likely",
"detailedForecast": "Rain likely. Cloudy, with a low around 51. Chance of precipitation is 70%. New rainfall amounts between a quarter and half of an inch possible."
},
{
"number": 7,
"name": "Saturday",
"startTime": "2023-11-18T06:00:00-05:00",
"endTime": "2023-11-18T18:00:00-05:00",
"isDaytime": true,
"temperature": 55,
"temperatureUnit": "F",
"temperatureTrend": null,
"probabilityOfPrecipitation": {
"unitCode": "wmoUnit:percent",
"value": 70
},
"dewpoint": {
"unitCode": "wmoUnit:degC",
"value": 11.111111111111111
},
"relativeHumidity": {
"unitCode": "wmoUnit:percent",
"value": 100
},
"windSpeed": "8 to 18 mph",
"windDirection": "NW",
"icon": "https://api.weather.gov/icons/land/day/rain,70/rain,30?size=medium",
"shortForecast": "Light Rain Likely",
"detailedForecast": "Rain likely before 1pm. Partly sunny, with a high near 55. Chance of precipitation is 70%."
},
{
"number": 8,
"name": "Saturday Night",
"startTime": "2023-11-18T18:00:00-05:00",
"endTime": "2023-11-19T06:00:00-05:00",
"isDaytime": false,
"temperature": 40,
"temperatureUnit": "F",
"temperatureTrend": null,
"probabilityOfPrecipitation": {
"unitCode": "wmoUnit:percent",
"value": null
},
"dewpoint": {
"unitCode": "wmoUnit:degC",
"value": 1.1111111111111112
},
"relativeHumidity": {
"unitCode": "wmoUnit:percent",
"value": 65
},
"windSpeed": "12 to 17 mph",
"windDirection": "NW",
"icon": "https://api.weather.gov/icons/land/night/few?size=medium",
"shortForecast": "Mostly Clear",
"detailedForecast": "Mostly clear, with a low around 40."
},
{
"number": 9,
"name": "Sunday",
"startTime": "2023-11-19T06:00:00-05:00",
"endTime": "2023-11-19T18:00:00-05:00",
"isDaytime": true,
"temperature": 50,
"temperatureUnit": "F",
"temperatureTrend": null,
"probabilityOfPrecipitation": {
"unitCode": "wmoUnit:percent",
"value": null
},
"dewpoint": {
"unitCode": "wmoUnit:degC",
"value": -0.55555555555555558
},
"relativeHumidity": {
"unitCode": "wmoUnit:percent",
"value": 65
},
"windSpeed": "10 to 14 mph",
"windDirection": "W",
"icon": "https://api.weather.gov/icons/land/day/few?size=medium",
"shortForecast": "Sunny",
"detailedForecast": "Sunny, with a high near 50."
},
{
"number": 10,
"name": "Sunday Night",
"startTime": "2023-11-19T18:00:00-05:00",
"endTime": "2023-11-20T06:00:00-05:00",
"isDaytime": false,
"temperature": 38,
"temperatureUnit": "F",
"temperatureTrend": null,
"probabilityOfPrecipitation": {
"unitCode": "wmoUnit:percent",
"value": null
},
"dewpoint": {
"unitCode": "wmoUnit:degC",
"value": -0.55555555555555558
},
"relativeHumidity": {
"unitCode": "wmoUnit:percent",
"value": 67
},
"windSpeed": "13 mph",
"windDirection": "NW",
"icon": "https://api.weather.gov/icons/land/night/few?size=medium",
"shortForecast": "Mostly Clear",
"detailedForecast": "Mostly clear, with a low around 38."
},
{
"number": 11,
"name": "Monday",
"startTime": "2023-11-20T06:00:00-05:00",
"endTime": "2023-11-20T18:00:00-05:00",
"isDaytime": true,
"temperature": 46,
"temperatureUnit": "F",
"temperatureTrend": null,
"probabilityOfPrecipitation": {
"unitCode": "wmoUnit:percent",
"value": null
},
"dewpoint": {
"unitCode": "wmoUnit:degC",
"value": -1.6666666666666667
},
"relativeHumidity": {
"unitCode": "wmoUnit:percent",
"value": 70
},
"windSpeed": "13 mph",
"windDirection": "NW",
"icon": "https://api.weather.gov/icons/land/day/sct?size=medium",
"shortForecast": "Mostly Sunny",
"detailedForecast": "Mostly sunny, with a high near 46."
},
{
"number": 12,
"name": "Monday Night",
"startTime": "2023-11-20T18:00:00-05:00",
"endTime": "2023-11-21T06:00:00-05:00",
"isDaytime": false,
"temperature": 38,
"temperatureUnit": "F",
"temperatureTrend": null,
"probabilityOfPrecipitation": {
"unitCode": "wmoUnit:percent",
"value": null
},
"dewpoint": {
"unitCode": "wmoUnit:degC",
"value": -1.1111111111111112
},
"relativeHumidity": {
"unitCode": "wmoUnit:percent",
"value": 70
},
"windSpeed": "10 mph",
"windDirection": "N",
"icon": "https://api.weather.gov/icons/land/night/sct?size=medium",
"shortForecast": "Partly Cloudy",
"detailedForecast": "Partly cloudy, with a low around 38."
},
{
"number": 13,
"name": "Tuesday",
"startTime": "2023-11-21T06:00:00-05:00",
"endTime": "2023-11-21T18:00:00-05:00",
"isDaytime": true,
"temperature": 49,
"temperatureUnit": "F",
"temperatureTrend": null,
"probabilityOfPrecipitation": {
"unitCode": "wmoUnit:percent",
"value": 30
},
"dewpoint": {
"unitCode": "wmoUnit:degC",
"value": 2.7777777777777777
},
"relativeHumidity": {
"unitCode": "wmoUnit:percent",
"value": 73
},
"windSpeed": "9 to 13 mph",
"windDirection": "E",
"icon": "https://api.weather.gov/icons/land/day/bkn/rain,30?size=medium",
"shortForecast": "Partly Sunny then Chance Light Rain",
"detailedForecast": "A chance of rain after 1pm. Partly sunny, with a high near 49. Chance of precipitation is 30%."
},
{
"number": 14,
"name": "Tuesday Night",
"startTime": "2023-11-21T18:00:00-05:00",
"endTime": "2023-11-22T06:00:00-05:00",
"isDaytime": false,
"temperature": 46,
"temperatureUnit": "F",
"temperatureTrend": null,
"probabilityOfPrecipitation": {
"unitCode": "wmoUnit:percent",
"value": 50
},
"dewpoint": {
"unitCode": "wmoUnit:degC",
"value": 7.7777777777777777
},
"relativeHumidity": {
"unitCode": "wmoUnit:percent",
"value": 86
},
"windSpeed": "13 to 18 mph",
"windDirection": "S",
"icon": "https://api.weather.gov/icons/land/night/rain,50?size=medium",
"shortForecast": "Chance Light Rain",
"detailedForecast": "A chance of rain. Mostly cloudy, with a low around 46. Chance of precipitation is 50%."
}
]
}
}
EvaluateJSONPath parses the data to get the primary weather fields.
We then format those fields for PutSlack.
LLM Skipped. Read forecast on ${date} for ${weatherlocation} @ ${latitude},${longitude}
Used ${forecasturl} ${icon} Temp: ${temperature} ${temperatureunit} - ${temperaturetrend}
There is a wind ${winddirection} at ${windspeed}. ${detailedforecast}
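A trimmed-down version of this formatting step, written as a Python sketch, takes the first forecast period and fills a shorter message than the template above. The function name is hypothetical, and the date, coordinates, forecast URL, and icon are deliberately left out to keep the example small.

```python
def weather_message(forecast: dict, location: str) -> str:
    """Format the first forecast period as a short Slack message."""
    periods = forecast["properties"]["periods"]
    if not periods:
        raise ValueError("forecast has no periods")
    first = periods[0]
    # temperatureTrend is null in most periods; show it as an empty string.
    trend = first.get("temperatureTrend") or ""
    return (
        f"LLM Skipped. Read forecast for {location}\n"
        f"Temp: {first['temperature']} {first['temperatureUnit']} - {trend}\n"
        f"There is a wind {first['windDirection']} at {first['windSpeed']}. "
        f"{first['detailedForecast']}"
    )
```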
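Slack Output
If the post is a question for the LLM, we make sure there is only one record.
We use a few different models available in IBM WatsonX.AI on IBM Cloud, which can be accessed quickly through REST prompts.
The prompts were first tested and built in IBM's Prompt Lab, and the initial curl statement was copied from there.
See https://www.ibm.com/docs/en/watsonx-as-a-service?topic=models- for the foundation models supported by IBM watsonx.ai.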
ibm/mpt-7b-instruct2
meta-llama/llama-2-70b-chat
ibm/granite-13b-chat-v1
We first have to send our unique secure key to IBM, which returns a token to use in the next call. After parsing out the question, we send it to WatsonX via the REST API.
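The article does not name the token endpoint. The Python sketch below assumes the standard IBM Cloud IAM endpoint and API-key grant type, and only builds the request object so that it can be inspected without a network call. The function name is hypothetical.

```python
import urllib.parse
import urllib.request

# Assumption: the standard IBM Cloud IAM token endpoint (not stated in the article).
IAM_TOKEN_URL = "https://iam.cloud.ibm.com/identity/token"


def build_token_request(api_key: str) -> urllib.request.Request:
    """Build the form-encoded POST that exchanges an API key for a bearer token."""
    body = urllib.parse.urlencode({
        "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
        "apikey": api_key,
    }).encode("utf-8")
    return urllib.request.Request(
        IAM_TOKEN_URL,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen` should return JSON whose `access_token` field is the value to pass as a bearer token on the generation call.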
We build a prompt to send to IBM as follows:
{
"model_id": "meta-llama/llama-2-70b-chat",
"input": "${inputs:urlEncode()}",
"parameters": {
"decoding_method": "greedy",
"max_new_tokens": 200,
"min_new_tokens": 50,
"stop_sequences": [],
"repetition_penalty": 1
},
"project_id": "0ead8ec4-d137-4f9c-8956-50b0da4a7068" }
We parse out the generated text, which is the generative AI result, plus some helpful metadata on timings.
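The request and the parsing step can be sketched together in Python. The endpoint is taken from the Request URL reported in the metadata later in this article, and the parameters mirror the prompt JSON above. The bearer-token header, the response field names (`results`, `generated_token_count`, `input_token_count`), and both function names are assumptions for illustration. Unlike the NiFi template, which URL-encodes the input, this sketch relies on `json.dumps` to escape the question text.

```python
import json
import urllib.request

GENERATION_URL = (
    "https://us-south.ml.cloud.ibm.com/ml/v1-beta/generation/text"
    "?version=2023-05-29"
)


def build_generation_request(question: str, token: str, project_id: str,
                             model_id: str = "meta-llama/llama-2-70b-chat"):
    """Build the POST carrying the prompt; nothing is sent here."""
    payload = {
        "model_id": model_id,
        "input": question,
        "parameters": {
            "decoding_method": "greedy",
            "max_new_tokens": 200,
            "min_new_tokens": 50,
            "stop_sequences": [],
            "repetition_penalty": 1,
        },
        "project_id": project_id,
    }
    return urllib.request.Request(
        GENERATION_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


def parse_generation(response: dict) -> dict:
    """Pull the generated text and basic metadata out of a response dict."""
    results = response.get("results") or []
    if not results:
        raise ValueError("response contained no results")
    first = results[0]
    return {
        "generated_text": (first.get("generated_text") or "").strip(),
        "stop_reason": first.get("stop_reason"),
        "generated_token": first.get("generated_token_count"),
        "tokencount": first.get("input_token_count"),
        "model_id": response.get("model_id"),
        "created_at": response.get("created_at"),
    }
```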
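The result posted to Slack is as follows:
"You can use Apache NiFi to integrate Generative AI models in several ways:
1. Data Preprocessing: Use NiFi to preprocess data before feeding it into your Generative AI model. This can include data cleaning, transformation, and feature engineering.
2. Model Training: Use NiFi to automate the training process of your Generative AI model. You can use NiFi's PutFile and PutFile_SFTP processors to write the training data to a file, and then use a processor like ExecuteScript to run the training script.
3. Model Deployment: Once your Generative AI model is trained, you can use NiFi to deploy it. You can create a NiFi flow that takes in input data, runs it through the Generative AI model, and then outputs the generated data.
4. Real-time Inference: You can use NiFi's StreamingJobs
After the Slackbot posts the result, it posts metrics and debugging information to the chat channel.
All of the metadata is posted to another Slack channel for administrators to monitor.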
==== NiFi to IBM WatsonX.AI LLM Answers
On Date: Wed, 15 Nov 2023 15:43:29 GMT Created: 2023-11-15T15:43:29.248Z
Prompt: Q: What is a good way to integrate Generative AI and Apache NiFi?
Response: )
You can use Apache NiFi to integrate Generative AI models in several ways:
1. Data Preprocessing: Use NiFi to preprocess data before feeding it into your Generative AI model. This can include data cleaning, transformation, and feature engineering.
2. Model Training: Use NiFi to automate the training process of your Generative AI model. You can use NiFi's PutFile and PutFile_SFTP processors to write the training data to a file, and then use a processor like ExecuteScript to run the training script.
3. Model Deployment: Once your Generative AI model is trained, you can use NiFi to deploy it. You can create a NiFi flow that takes in input data, runs it through the Generative AI model, and then outputs the generated data.
4. Real-time Inference: You can use NiFi's StreamingJobs
Token: 200
Req Duration: 8153
HTTP TX ID: 89d71099-da23-4e7e-89f9-4e8f5620c0fb
IBM Msg: This model is a Non-IBM Product governed by a third-party license that may impose use restrictions and other obligations. By using this model you agree to its terms as identified in the following URL. URL: https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-models.html?context=wx
IBM Msg ID: disclaimer_warning
Model ID: meta-llama/llama-2-70b-chat
Stop Reason: max_tokens
Token Count: 38
TX ID: NGp0djg-c05f740f84f84b7c80f93f9da05aa756
UUID: da0806cb-6133-4bf4-808e-1fbf419c09e3
Corr ID: NGp0djg-c05f740f84f84b7c80f93f9da05aa756
Global TX ID: 20c3a9cf276c38bcdaf26e3c27d0479b
Service Time: 478
Request ID: 03c2726a-dcb6-407f-96f1-f83f20fe9c9c
File Name: 1a3c4386-86d2-4969-805b-37649c16addb
Request Duration: 8153
Request URL: https://us-south.ml.cloud.ibm.com/ml/v1-beta/generation/text?version=2023-05-29
cf-ray: 82689bfd28e48ce2-EWR
Make Your Own Slackbot
Slack Output
Kafka Distribution
Apache Flink SQL Table Creation DDL
CREATE TABLE `ssb`.`Meetups`.`watsonairesults` (
`date` VARCHAR(2147483647),
`x_global_transaction_id` VARCHAR(2147483647),
`x_request_id` VARCHAR(2147483647),
`cf_ray` VARCHAR(2147483647),
`inputs` VARCHAR(2147483647),
`created_at` VARCHAR(2147483647),
`stop_reason` VARCHAR(2147483647),
`x_correlation_id` VARCHAR(2147483647),
`x_proxy_upstream_service_time` VARCHAR(2147483647),
`message_id` VARCHAR(2147483647),
`model_id` VARCHAR(2147483647),
`invokehttp_request_duration` VARCHAR(2147483647),
`message` VARCHAR(2147483647),
`uuid` VARCHAR(2147483647),
`generated_text` VARCHAR(2147483647),
`transaction_id` VARCHAR(2147483647),
`tokencount` VARCHAR(2147483647),
`generated_token` VARCHAR(2147483647),
`ts` VARCHAR(2147483647),
`advisoryId` VARCHAR(2147483647),
`eventTimeStamp` TIMESTAMP(3) WITH LOCAL TIME ZONE METADATA FROM 'timestamp',
WATERMARK FOR `eventTimeStamp` AS `eventTimeStamp` - INTERVAL '3' SECOND
) WITH (
'deserialization.failure.policy' = 'ignore_and_log',
'properties.request.timeout.ms' = '120000',
'format' = 'json',
'properties.bootstrap.servers' = 'kafka:9092',
'connector' = 'kafka',
'properties.transaction.timeout.ms' = '900000',
'topic' = 'watsonxaillmanswers',
'scan.startup.mode' = 'group-offsets',
'properties.auto.offset.reset' = 'earliest',
'properties.group.id' = 'watsonxaillmconsumer'
)
CREATE TABLE `ssb`.`Meetups`.`watsonxresults` (
`date` VARCHAR(2147483647),
`x_global_transaction_id` VARCHAR(2147483647),
`x_request_id` VARCHAR(2147483647),
`cf_ray` VARCHAR(2147483647),
`inputs` VARCHAR(2147483647),
`created_at` VARCHAR(2147483647),
`stop_reason` VARCHAR(2147483647),
`x_correlation_id` VARCHAR(2147483647),
`x_proxy_upstream_service_time` VARCHAR(2147483647),
`message_id` VARCHAR(2147483647),
`model_id` VARCHAR(2147483647),
`invokehttp_request_duration` VARCHAR(2147483647),
`message` VARCHAR(2147483647),
`uuid` VARCHAR(2147483647),
`generated_text` VARCHAR(2147483647),
`transaction_id` VARCHAR(2147483647),
`tokencount` VARCHAR(2147483647),
`generated_token` VARCHAR(2147483647),
`ts` VARCHAR(2147483647),
`eventTimeStamp` TIMESTAMP(3) WITH LOCAL TIME ZONE METADATA FROM 'timestamp',
WATERMARK FOR `eventTimeStamp` AS `eventTimeStamp` - INTERVAL '3' SECOND
) WITH (
'deserialization.failure.policy' = 'ignore_and_log',
'properties.request.timeout.ms' = '120000',
'format' = 'json',
'properties.bootstrap.servers' = 'kafka:9092',
'connector' = 'kafka',
'properties.transaction.timeout.ms' = '900000',
'topic' = 'watsonxaillm',
'scan.startup.mode' = 'group-offsets',
'properties.auto.offset.reset' = 'earliest',
'properties.group.id' = 'allwatsonx1'
)
Example Prompt
{"inputs":"Please answer to the following question. What is the capital of the United States?"}
IBM DB2 SQL
alter table "DB2INST1"."TRAVELADVISORY"
add column "summary" VARCHAR(2048);
-- DB2INST1.TRAVELADVISORY definition
CREATE TABLE "DB2INST1"."TRAVELADVISORY" (
"TITLE" VARCHAR(250 OCTETS) ,
"PUBDATE" VARCHAR(250 OCTETS) ,
"LINK" VARCHAR(250 OCTETS) ,
"GUID" VARCHAR(250 OCTETS) ,
"ADVISORYID" VARCHAR(250 OCTETS) ,
"DOMAIN" VARCHAR(250 OCTETS) ,
"CATEGORY" VARCHAR(4096 OCTETS) ,
"DESCRIPTION" VARCHAR(4096 OCTETS) ,
"UUID" VARCHAR(250 OCTETS) NOT NULL ,
"TS" BIGINT NOT NULL ,
"summary" VARCHAR(2048 OCTETS) )
IN "IBMDB2SAMPLEREL"
ORGANIZE BY ROW;
ALTER TABLE "DB2INST1"."TRAVELADVISORY"
ADD PRIMARY KEY
("UUID")
ENFORCED;
GRANT CONTROL ON TABLE "DB2INST1"."TRAVELADVISORY" TO USER "DB2INST1";
GRANT CONTROL ON INDEX "SYSIBM "."SQL230620142604860" TO USER "DB2INST1";
SELECT "summary", TITLE , ADVISORYID , TS, PUBDATE FROM DB2INST1.TRAVELADVISORY t
WHERE "summary" IS NOT NULL
ORDER BY ts DESC
Example Output
GitHub README
GitHub repo
Source Code
Original title: Building a Real-Time Slackbot With Generative AI, by Tim Spann